IBM TS3100 LTO-6 SAS error
We have been running an LTO-5 library from IBM for 4 years without any sort of trouble,
but then the drive suddenly broke down.
Everyone suggested stepping up to LTO-6, which we did;
the only other change we made was to update the library firmware to match the drive.
Since then we have had trouble writing “big” backups to tape; see the error below.
Every normal backup operation (mostly under 100GB each) works like a charm,
error free both in backup and verify.
When a script writes around 1TB or more to tape, the backup stops with a communication error,
sometimes after 950GB, 600GB or even as little as 300GB, and adds a new tape to the media set.
In this particular instance we are backing up from Retrospect client version 12.5.
It happens with both LTO-6 and LTO-5 tape media in the LTO-6 drive.
There are no errors in the logs for the drive or the library.
However, there is no error when writing the same set of data (1.6TB) with BRU PE v. 3. On the other hand, the verify process is rather slow: it started this morning and will probably finish on Monday.
Maybe that points to some underlying problem.
The big question is where the most likely place to start diagnosing is.
Is this maybe a known issue with LTO-6 and Retrospect?
Or is it the backup server hardware?
What I have done so far:
SAS cable and ports
Firmware upd on the SAS controller
Increased cooling (fancontrol) on the MacPro
MacPro 1.1 Intel - 32 bit HW
OS X 10.7.5
6GB RAM
Sys HD Mirror w. SoftRaid v. 4.53
SAS Controller Atto Express H380 driver 2.02, firmw June 2011
Software Retrospect 12.5 (111)
Library IBM TS3100- Firmw. D.00 / 3.20e - 4 years Old
Drive-ULT3580-HH6 SAS- Firmw. F9A1 - 6 months old
* Script: Gagnasafn_LTO6_A03_PP22
* Date: 2/10/2016 2:49 AM
* Errors: 1
* Warnings: 0
* Performance: 2562.4 MB/minute
* Duration: 09:54:27
* Server: Mubarak
+Normal backup using Gagnasafn_LTO6_A03_PP22 at 2/9/2016 4:54 PM (Execution unit 2)
To Backup Set Gagnasafn_LTO6-A03...
- 2/9/2016 4:54:46 PM: Copying PG_Lok22 on Cosimo
2/9/2016 4:56:43 PM: Found: 406116 files, 52271 folders, 1459.5 GB
2/9/2016 4:56:49 PM: Finished matching
2/9/2016 4:57:41 PM: Copying: 393217 files (1463.2 GB) and 0 hard links
stucFinished: [IBM|ULT3580-HH6|F9A1] incorrect scsiServiceResponse 0x1, scsiStatus 0x2
stucFinished: [0|0|0] transaction result 0x6
!Trouble writing: "1-Gagnasafn_LTO6-A03" (2353004544), error -102 (trouble communicating)
!Trouble writing media:
error -102 (trouble communicating)
2/10/2016 2:46:54 AM: Building Snapshot...
2/10/2016 2:46:54 AM: Checking 52271 folders for ACLs or extended attributes
2/10/2016 2:48:49 AM: Finished copying 52271 folders with ACLs or extended attributes
2/10/2016 2:48:59 AM: Copying Snapshot: 2 files (154.5 MB)
2/10/2016 2:49:08 AM: Snapshot stored, 154.5 MB
2/10/2016 2:49:08 AM: 1 execution errors
Completed: 393217 files, 1463.2 GB
Performance: 2562.4 MB/minute
Duration: 09:54:21 (00:09:39 idle/loading/preparing)
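For anyone reading the log above, two of its lines can be turned into something more readable. The sketch below is only an illustration, not anything Retrospect provides: it looks up the scsiStatus value in the standard SCSI status code table and converts the reported MB/minute figure to MB/s. The function names are made up for this example, and the 160 MB/s figure is the nominal LTO-6 native rate, used here as an assumption for comparison.

```python
import re

# Status byte values as defined by the SCSI Architecture Model (SAM).
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x04: "CONDITION MET",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
    0x30: "ACA ACTIVE",
    0x40: "TASK ABORTED",
}

# Assumption: nominal LTO-6 native (uncompressed) streaming rate in MB/s.
LTO6_NATIVE_MB_PER_S = 160.0


def decode_scsi_status(line):
    """Return the SAM name of the scsiStatus value found in a log line."""
    match = re.search(r"scsiStatus (0x[0-9a-fA-F]+)", line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return SCSI_STATUS.get(code, "unknown (0x%02x)" % code)


def mb_per_second(line):
    """Convert a 'Performance: N MB/minute' log line to MB/s."""
    match = re.search(r"Performance: ([0-9.]+) MB/minute", line)
    if match is None:
        return None
    return float(match.group(1)) / 60.0


# The two lines below are copied from the log in this post.
status_line = ("stucFinished: [IBM|ULT3580-HH6|F9A1] incorrect "
               "scsiServiceResponse 0x1, scsiStatus 0x2")
perf_line = "Performance: 2562.4 MB/minute"

status = decode_scsi_status(status_line)
rate = mb_per_second(perf_line)
print("scsiStatus:", status)
print("average rate: %.1f MB/s (%.0f%% of assumed native rate)"
      % (rate, 100.0 * rate / LTO6_NATIVE_MB_PER_S))
```

Status 0x2 is CHECK CONDITION, which means the drive had sense data to report; the sense key itself is not in the log, so the actual cause cannot be read from this alone. The average rate works out to roughly 42.7 MB/s, well below the nominal native rate, which may be worth keeping in mind when looking at the server and SAS side.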
ODDI Printing & packaging.
Please see my other note - we are looking into the verify performance issue, but it's very odd since the version of the BRU backup has not changed in some time.
CTO - TOLIS Group, Inc.
BRU ... because it's the RESTORE that matters!