Console Servers
$Revision: 1.7 $
$Date: 1999/11/19 15:52:17 $GMT
There are 2 console servers for the D0En cluster, d0ensrv3 and
d0ensrv5. Normally there would only be one console server per Enstore
system, but since there are PC nodes on both sides of the AML/2 robot,
it was easier have 2 console servers. Each console has full
functionality for the entire system. For example, a console for any
Enstore server or mover PC can be gotten from either console server.
The console servers have these major purposes:
- Provide access to the Enstore PC root consoles via its ttyS0.
To get a console:
- Log in as root to one of the console servers.
- If you are running X11, type "cons nodename"; for
example "cons d0ensrv2" or "cons d0enmvr11a". Make sure you
allow console server xhost access to your local node.
- If you are not running X11, then type "console nodename"
and the software runs in your current environment.
- If you actually typing on the console monitor, you can
use the start menu or right-button menu to do the same thing.
- Provide access to the Enstore PC boot bios via its ttyS1.
- Getting a console for the boot bios is the same as the
kernel consoles, except you attach "bios" to the node
name. For example, "cons d0ensmvr11abios".
- The boot bios screens are only active when the PC is
booting.
- Eventually, it will be possible to reset the node using
the serial reset capabilities of the BMC over the bios
windows.
- Provide backup storage (disk and tape) for primary pnfs and
Enstore databases and other important information. This only runs on
d0ensrv3 for now.
- Once an hour, the pnfs databases are copied from
d0ensrv1 to /diska/pnfs-backup. Files are kept on disk for
approximately 1 week.
- Once an hour, the Enstore databases are copied from
d0ensrv1 to /diska/enstore-backup. Files are kept on disk
for approximately 1 week.
- Once an hour, the entire system disk (except for
unreadable files
MPTN/ETC/TRUSERS|OS2/SYSTEM/SWAPPER.DAT|tcpip/DOS/ETC/TRUSERS)
from the AML/2 OS/2 node adic2 is mirrored to
/diska/aml2Shadow. This provides a simple way to see what is
on the adic2 node as well as a backup.
- Every 15 minutes, the Enstore log files are copied from
d0ensrv2 to /diska/enstore-log-backup. These files are not
critical to keeping the system running, but are our source
of what happened. They are copied more frequently than the
critical files because this is the only copy we have (the
d0ensrv1 databases are on mirrored and duplicated disks).
Files are kept on disk for approximately 1 week.
- Twice a day, at 7 AM and 7 PM, a backup of files,
(listed in the preceding bullets) currently on disk is made
to tape. There is a full backup Monday and Thursday
mornings. [This is a sticky flag, a full backup is
attempted until it is successful.] The other backups are
incremental since the last full backup.
- The tape drive is a local (not in the AML/2
robot) mammoth-1 drive.
- Tapes get full, and then need to be replaced,
about once a month.
- Enstore's volume-import procedure is used to
create the backup of the disks (not fmb). This
allows us to insert the tape, if we need to, into the
AML/2 robot and read the information with
Enstore. The volume import information resides in
/diska/BackupTapeDB.
- Individual files are not copied to tape. Rather
a tar container of each major section is created
first and that is what is backed up to tape. These
tar containers are created in /diska/BackupToTape.
- Provide a netscape browser for examining Enstore's current state.
- The easiest way to get the browser is log in as enstore
and type "netscape". Don't run netscape as root. Log in as
enstore instead.
- If you are actually typing on the console monitor, you
can use the start menu or right-button menu to do the same
thing.
- Provide a logical place to control or monitor Enstore. The
console servers are the nodes people should log into and not any of
the other ones.
To monitor/control the other nodes:
- Log in as enstore (not as root!)
- Get the correct environment "setup enstore"
- To issue an Enstore command: "enstore command <
farmlet > < options >". For example, "enstore EPS"
will list the Enstore processes on all Enstore nodes;
"enstore EPS d0enmvr" will list the Enstore processes on
just the mover nodes; and "enstore EPS d0ensrv1" will list
the Enstore processes on just d0ensrv1. See other sections
for more details on controlling Enstore.
- There is a local mammoth-1 tape drive that allows administrators
to examine and/or copy tapes. This allows detailed investigation
outside the AML/2 and without affecting the users.
Cron jobs
. executing on d0ensrv3:
- user enstore - aml2mirror. If this fails, the most likely reason
is that the adic2 OS/2 computer is not working or the network
connection to it is unavailable.
- user enstore - backup2Tape. If this fails, the most likely reason
is that the tape is full or damaged and needs to be replaced. You will
get mail warning you the tape is almost filled and needs
replacing. You'll also get mail when the tape drive cleaning bit is
set and needs to be cleaned.
To replace a tape, put a new labelled tape into the drive and
- log in as root to d0ensrv3
- type the following:
- TAPE=/dev/rmt/tps0d4n; export TAPE
- TAPE_DEVICE=$TAPE; export TAPE_DEVICE
- TAPE_DB=/diska/BackupTapeDB; export TAPE_DB
- setup enstore
- $ENSTORE_DIR/volume_import/enstore_tape --init --erase --volume_label= < NEW_LABEL >
- edit $TAPE_DB/CURRENT_TAPE and enter the < NEW_LABEL > of the replacement tape.
- user root - perm. If this fails, look in /diska/pnfs-backup for
"peculiar" things, such as strange file ownership or permissions.
Report it to an Enstore developer.
- user root - tarit. If this fails, the most likely reason is
because either rip8 (the backup node for the ide system disk) or the
network connection to rip8 is down.
Console Software
- The software was gotten from http://www.gnac.com/conserver/.
This is a more up-to-date version of the console software typically
distributed in the standard Fermi console server distribution.
- The escape sequence is "^Ec < command_letter > "; "h" for the
command letter gives you a list of commands.
- The logs of all traffic on the console and bios screens are saved
in separate log files in the /var/log/conserver directory. (Note, the
logs are not on a separate disk called /usr/farm as is done in the
standard console server install.) The console server log of itself is
in /var/log/conserver.log.
- Occasionally a port seems to get "stuck" and you can't
communicate with it. Usually if you send a ^Q, the port will
clear. If that doesn't work, type "cylines -r" and check for errors.
- The console software is started on bootup. To do it by hand,
type "/etc/rc.d/init.d/conserver stop" or "/etc/rc.d/init.d/conserver
start".
Bootup
- The console server software is started.
- FTT makes /dev/rmt scsi devices.
- The BMC watchdog is armed and deadman is started at real-time
priority.
- You need to start the window manager manually by logging in and
typing "startx".
Some Details:
- A Cyclades Cyclom-YeP Multiport Module provides the serial
connectivity to various ports.
- For reasons I don't understand, the linux cyclades kernel appears
to have to be built on d0ensrv3 and d0ensrv5 before it recognizes the
cyclades hardware. This seems to mean that the kernel make process probes
the hardware, which seems very unlikely. In any event, right now the
linux kernel needs to be remade (without changing any options in the
config file) on each console server node. This is a small problem not
worth worrying about.
- The console server nodes can be booted without affecting normal
Enstore operations. You might get some errors from the cron jobs when
you boot the node.
- The xscreensaver (http://www.jwz.org/xscreensaver) software is
running in the background. This software helps identify the console
server on its monitor.