Wednesday, November 2, 2011
Monitoring disk space usage on Sun Fishworks (7000) ZFS Storage Appliances
We needed a way to have our existing monitoring system alert us if a project was running out of space. There's no single CLI command that shows space usage across all projects and shares, but this bit of ECMAScript 3 will output an easily parsed table:
script
//
// jwasilko@gmail.com
// fishworks' cli user interface doesn't provide a good way to monitor
// disk space of all projects. This is an attempt to make up for that.
//
run('shares');
projects = list();
printf('%-40s %-10s %-10s %-10s\n', 'SHARE', 'AVAIL', 'USED', 'SNAPUSED');
// For each project, walk its shares and print the space properties of each
for (i = 0; i < projects.length; i++) {
    run('select ' + projects[i]);
    shares = list();
    for (j = 0; j < shares.length; j++) {
        run('select ' + shares[j]);
        share = projects[i] + '/' + shares[j];
        used = run('get space_data').split(/\s+/)[3];
        avail = run('get space_available').split(/\s+/)[3];
        snap = run('get space_snapshots').split(/\s+/)[3];
        printf('%-40s %-10s %-10s %-10s\n', share, avail, used, snap);
        run('cd ..');
    }
    run('cd ..');
}
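If you want to feed that table into a monitoring system, here's a rough sketch of a wrapper that could run on the monitoring host. It's only an example: the local script name (project_space), the appliance hostname (sun7310-1), the 50GB threshold, and the Nagios-style exit codes are all assumptions for illustration, not anything the appliance requires.

#!/usr/bin/env python
# Sketch of a monitoring wrapper around the table the script above prints.
# Hypothetical names: the script is saved locally as "project_space" and the
# appliance answers ssh as "sun7310-1". AVAIL values are assumed to be size
# strings like "1.5T", "829G", "512M".
import subprocess, sys

THRESHOLD_GB = 50   # alert if a share has less than this much space available
UNITS = {'K': 1.0 / (1024 * 1024), 'M': 1.0 / 1024, 'G': 1.0, 'T': 1024.0}

def to_gb(value):
    # Convert a size string such as '1.5T' to gigabytes.
    if value and value[-1] in UNITS:
        return float(value[:-1]) * UNITS[value[-1]]
    return float(value) / (1024 ** 3)   # assume plain bytes otherwise

output = subprocess.check_output(['ssh', 'sun7310-1'], stdin=open('project_space'))
problems = []
for line in output.decode().splitlines():
    fields = line.split()
    if len(fields) != 4 or fields[0] == 'SHARE':
        continue                        # skip the header line and anything unexpected
    share, avail = fields[0], fields[1]
    if to_gb(avail) < THRESHOLD_GB:
        problems.append('%s has only %s available' % (share, avail))

if problems:
    print('CRITICAL: ' + '; '.join(problems))
    sys.exit(2)                         # Nagios-style critical
print('OK: all shares have at least %d GB available' % THRESHOLD_GB)
sys.exit(0)

You'd run the appliance script the same way as the replication script shown further down (ssh to the appliance with stdin redirected from the script), with ssh keys set up so no password prompt gets in the way.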
Tuesday, March 22, 2011
Celerra datamover group file doc bug
We're testing NFSv4, which requires a user/group database (either local files or LDAP/NIS) on the datamover.
Username/UID mapping was working properly, but group/GID mapping was not.
The Celerra Naming Services (6.0) doc on page 21 lists the format of the group file as:
groupname:gid:user_list
But the proper format includes a field for the group password:
groupname:password:gid:user_list
The password field is usually unused and just holds a placeholder (x).
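For example, an entry for a hypothetical group dba with GID 500 and two members (all of these values are made up for illustration) would look like:

dba:x:500:ora1,ora2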
Hope this helps someone else avoid the hassle we ran into.
Monday, March 7, 2011
Celerra top talkers & suspicious ops defined
The EMC Celerra datamovers have the ability to log statistics about top talkers, which can be useful for tracking down problems. We run server_stats with these options to get top talker stats:
/nas/bin/server_stats server_2 -top nfs -i 5 -c 60
One thing worth noting is that the output includes a column labeled "NFS Suspicious Ops". There's no documentation on this column, and it took EMC some time to dig up the answer. Here it is:
SUSPICIOUS EVENTS:
One of the TopTalker output columns lists Suspicious Ops/second.
"Suspicious" events are any of the following, which are typical of the patterns seen when viruses or other badly behaved software/users are attacking a system:
CIFS events:
- ACCESS_DENIED returned for FindFirst
- ACCESS_DENIED returned for Open/CreateFile
- ACCESS_DENIED returned for DeleteFile
- SUCCESS returned for DeleteFile
- SUCCESS returned for TruncateFile (size=0)
NFSv2/v3/v4 events:
- NFSERR_ACCES returned for NFS OPEN/LOOKUP/CREATE/DELETE
- NFSERR_ACCES returned for READDIR/READDIRPLUS
- NFS_OK for NFS REMOVE
- NFS_OK for NFS SETATTR (size=0)
Saturday, January 1, 2011
Monitoring share-based replication on Sun Fishworks (7000) Appliances
We use Sun/Oracle's Fishworks (7000) ZFS Storage Appliances to store our Oracle archive logs and to replicate them to our DR datacenter.
We generate more than 2TB of archive logs per day, and ZFS' compression helps knock that down to a somewhat more manageable 500GB a day. Initially we were using project-based replication which was easy to configure, but unfortunately there was not enough parallelism to keep up with our change rate.
Sun suggested setting up replications for each share (we have 16 shares per database cluster) to improve throughput. It's worked well, but the user interface didn't provide an overview of replication status.
Fortunately, the CLI can be scripted using JavaScript, so it was easy to loop over the projects and shares and extract the replication status.
To run the script, just ssh to the appliance and redirect stdin from the script:
ldap1{jwasilko}64: ssh sun7310-1 < replication_status
Pseudo-terminal will not be allocated because stdin is not a terminal.
Password:
Current time: Sun Jan 02 2011 02:35:06 GMT+0000 (UTC)
Share LastSync LastTry NextTry
db/archivelogs_rman10 Sun Jan 02 2011 02:25:13 GMT+0000 (UTC) Sun Jan 02 2011 02:25:13 GMT+0000 (UTC) Sun Jan 02 2011 02:55:00 GMT+0000 (UTC)
db/archivelogs_rman12 Sun Jan 02 2011 02:26:13 GMT+0000 (UTC) Sun Jan 02 2011 02:26:13 GMT+0000 (UTC) Sun Jan 02 2011 02:56:00 GMT+0000 (UTC)
db/archivelogs_rman14 Sun Jan 02 2011 02:27:13 GMT+0000 (UTC) Sun Jan 02 2011 02:27:13 GMT+0000 (UTC) Sun Jan 02 2011 02:57:00 GMT+0000 (UTC)
db/archivelogs_rman16 Sun Jan 02 2011 02:28:13 GMT+0000 (UTC) Sun Jan 02 2011 02:28:13 GMT+0000 (UTC) Sun Jan 02 2011 02:58:00 GMT+0000 (UTC)
db/archivelogs_rman2 Sun Jan 02 2011 02:21:21 GMT+0000 (UTC) Sun Jan 02 2011 02:21:21 GMT+0000 (UTC) Sun Jan 02 2011 02:51:00 GMT+0000 (UTC)
db/archivelogs_rman4 Sun Jan 02 2011 02:22:13 GMT+0000 (UTC) Sun Jan 02 2011 02:22:13 GMT+0000 (UTC) Sun Jan 02 2011 02:52:00 GMT+0000 (UTC)
db/archivelogs_rman6 Sun Jan 02 2011 02:33:18 GMT+0000 (UTC) Sun Jan 02 2011 02:33:18 GMT+0000 (UTC) Sun Jan 02 2011 03:03:00 GMT+0000 (UTC)
db/archivelogs_rman8 Sun Jan 02 2011 02:24:13 GMT+0000 (UTC) Sun Jan 02 2011 02:24:13 GMT+0000 (UTC) Sun Jan 02 2011 02:54:00 GMT+0000 (UTC)
The script is below. I hope it might be useful for you.
script
//
// jwasilko@gmail.com
// fishworks' user interface doesn't provide a good way to monitor
// the health of share-based replication. this is an attempt to make
// up for that.
//
print("Current time: " + new Date());
printf('%-30s %-40s %-40s %-40s\n', "Share", "LastSync", "LastTry", "NextTry");
// Get the list of projects, to iterate over later
run('shares');
projects = list();
// For each project, list the shares
for (projectNum = 0; projectNum < projects.length; projectNum++) {
    run('select ' + projects[projectNum]);
    shares = list();
    // Walk into the share and select replication, then actions
    for (sharesNum = 0; sharesNum < shares.length; sharesNum++) {
        try { run('select ' + shares[sharesNum]); } catch (err) { dump(err); }
        share = projects[projectNum] + '/' + shares[sharesNum];
        run('replication');
        actions = list();
        // Some shares may not have share-specific replication actions,
        // so skip if needed. Otherwise, get the replication status
        if (actions.length > 0) {
            for (actionsNum = 0; actionsNum < actions.length; actionsNum++) {
                try { run('select ' + actions[actionsNum]); } catch (err) { dump(err); }
                lastsync = run('get last_sync').split(/=/)[1];
                lastsync = lastsync.replace(/\n/, "");
                lasttry = run('get last_try').split(/=/)[1];
                lasttry = lasttry.replace(/\n/, "");
                nextupdate = run('get next_update').split(/=/)[1];
                nextupdate = nextupdate.replace(/\n/, "");
                printf('%-30s %-40s %-40s %-40s\n', share, lastsync, lasttry, nextupdate);
            }
            run('cd ../..');
        } else {
            run('cd ..');
        }
        run('cd ..');
    }
    run('cd ..');
}
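Since we wanted alerting rather than just a report, here's a rough sketch of a check that could consume that output on the monitoring host and complain about stale shares. Everything in it is an assumption for illustration: the one-hour threshold, the fixed 30-character share column (matching the printf widths above), and the Nagios-style exit codes.

#!/usr/bin/env python
# Sketch of a staleness check fed by the replication_status output, e.g.:
#   ssh sun7310-1 < replication_status | python check_replication.py
# Hypothetical assumptions: share names stay under the 30-character column
# width used by the script's printf, and anything that hasn't synced within
# the last hour deserves an alert.
import sys
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=1)
stale = []

for line in sys.stdin:
    share = line[0:30].strip()
    if '/' not in share:      # skip the banner, header, and prompt lines
        continue
    lastsync = line[30:70].strip()
    # Drop the trailing "GMT+0000 (UTC)" so strptime can parse the rest.
    when = datetime.strptime(lastsync.split(' GMT')[0], '%a %b %d %Y %H:%M:%S')
    if datetime.utcnow() - when > MAX_AGE:
        stale.append('%s last synced %s' % (share, lastsync))

if stale:
    print('CRITICAL: ' + '; '.join(stale))
    sys.exit(2)               # Nagios-style critical
print('OK: all replication actions have synced within the last hour')
sys.exit(0)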