Cass Bonnette

SUBJECT: Control system rebuild project cbdebug.hlp

TITLE: TCS software debugging tools

DATE: 24 August 1994

AUTHOR: W. Cruise REVISION: 1.0


Introduction

This paper presents many of the tools available for debugging the cass bonnette sub-system of the Telescope Control System. It is assumed that the user is very familiar with the basic TCS commands for the bonnette, and has a reasonable familiarity with the bonnette control system hardware and the Galil controller. This paper is not a tutorial on troubleshooting; it is intended to explain the use of the tools available for troubleshooting. The examples given in this paper may be entirely made up, and may bear no relationship to actual command sequences that would be useful in real troubleshooting situations.

Because of the antique nature of the HP 1000 computer, and because of the general lack of familiarity with this beast, the paper goes into significant detail on the use of the HP 1000 facilities. If any user wishes to learn more about the HP 1000, and its RTE-6/VM operating system, there are many manuals around which can be helpful in studying this system.

The following are just some of the ways the TCS can be used in troubleshooting the cass bonnette control system. Many of the methods require that the TCS be logged in and operating on another terminal, some require that the user be logged into the TCS on his/her own terminal, and some require a quiet computer, without any other users competing for the bonnette resources. Many of the test facilities can be started in several ways, but operate about the same no matter how they are started. This list describes most of the ways you can interact with the bonnette system.


Bonnette commands

These are the user level commands which are available through the normal TCS session. They should be the first line of troubleshooting, and the last item on the checkout. These commands are covered in detail in the TCS Operator's Manual[1], and in the Cass Bonnette User's Manual.

CBENG program

CBENG is short for Cass Bonnette ENGineering. It is a FORTRAN 77 program which displays a lot of cass bonnette information, and can also accept pass-through commands to the Galil controller. It can be run once to get a snapshot of the bonnette status, or left running to provide a near real-time display of what the bonnette system is doing. CBENG gets its information from the main bonnette control program, CBCTL. It does not perform any I/O directly to the Galil controller. Therefore the normal bonnette software must be running for CBENG to be useful.

CBENG can be operated in several forms, and their differences are explained below. This first section covers the common aspects of using CBENG.

CBENG display

The CBENG program display should look something like the screen dump shown below under "CBENG display format". Improvements may change the display a bit, but the basic format and data will probably remain close to this version.


CBENG display format
    physical     user -HssH+     inc        abs  stop      velocity
 X  -171.399 -179.999       -1599964    2109336     1       5.00000
 Y  -103.500 -100.000       -1999986   10054753     1       5.00000
 Z   401.002                       0   10927979     1

    phys off  cal off user off
 X -2231.297   -8.600    0.000     Galil I/O:     0, err:    0
 Y -9922.595    3.500    0.000     I/O stats:    123456,    0,    0
 Z 11072.856    0.000

 Filter:     7
 ND:         1
 CM Rot:     6     TCS mirror pos: 0
 CM Trans:   1     Limits: park (+)
 Probe:      1
 Plate:      1
 On plate:   1
 Y chain:    1
 GO XY:      1
 GO Trans:   1
 over ride:  1


Axis values:

The first nine lines have information about the X, Y, and Z axes of the bonnette. The relevant axis is identified by an X, Y, or Z on the left side of each line. The parameters are identified by the labels above each column.

physical

This is the physical position of the axis, in millimetres. The physical coordinate system should very closely match that of the original bonnette control system. All the limits are specified in this system.

user

This is the position, in millimetres, in the coordinate system in which the end users work. It is the combination of the physical position, the calibration offset, and the user offset. This is the position which is displayed on the "touch panel" display, on the Telescope Operator's terminal, and on any optional displays on the data acquisition system.

-HssH+

This set of columns, which should normally be blank, displays indicators for all the axis limits. If a limit is activated, a mark appears on the X, Y, or Z line under the 'H' or 's' in the heading. If it is on the left side, which is identified with a '-', it is a negative limit. Also, the mark will be either a '-' or a '+' to further identify whether a limit is positive or negative.

Normally only the soft limits will get hit. In some cases the system might get onto a hard limit, in which case there should be two -'s or +'s, showing that both the soft and hard limits are activated. The hard limit should never be on while the soft limit is not; a hard limit showing alone means the soft limit has failed, which is a hardware failure.

inc

This is the raw incremental encoder reading from the Galil controller. It is not used in the control software, but is presented as a troubleshooting aid. To help in this work, it may be set to 0 at any point by using a Galil reset or DP command.

The encoder scale factor for X & Y is 20,000 counts/mm. For the Z axis it is 8571.42857142 counts/mm.
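As a rough illustration of these scale factors, the following Python sketch converts an "inc" reading to millimetres. The function and table names are hypothetical and are not part of the TCS software; the readings are the ones shown in the sample display.

```python
# Encoder scale factors quoted in the text, in counts per millimetre.
COUNTS_PER_MM = {"X": 20000.0, "Y": 20000.0, "Z": 8571.42857142}


def inc_to_mm(axis: str, counts: int) -> float:
    """Convert a raw incremental encoder count to millimetres."""
    return counts / COUNTS_PER_MM[axis]


if __name__ == "__main__":
    # "inc" readings copied from the sample CBENG display.
    for axis, counts in (("X", -1599964), ("Y", -1999986), ("Z", 0)):
        print(axis, round(inc_to_mm(axis, counts), 4))
```

Since the incremental reading can be zeroed at any time, the result is a distance travelled since the last zero, not a position in any of the coordinate systems.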

abs

This is the raw absolute encoder reading. The numbers are quite large because these are multi-turn encoders, and they are nearer the center of their range than the end.

The conversion from raw encoder reading to a position in user coordinates is raw reading/1024 + physical offset + calibration offset + user offset. All the offsets are in millimetres.
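The conversion can be checked by hand against the X and Y lines of the sample display. The Python sketch below is only an illustration of the formula as stated; the function name is hypothetical and nothing here is taken from the control program source.

```python
def abs_to_positions(raw_abs, phys_off, cal_off, user_off=0.0):
    """Return (physical, user) positions in mm from a raw absolute reading.

    Follows the formula in the text: raw/1024 plus the physical offset
    gives the physical position; adding the calibration and user
    offsets gives the user position.
    """
    physical = raw_abs / 1024 + phys_off
    user = physical + cal_off + user_off
    return physical, user


if __name__ == "__main__":
    # Values copied from the X and Y lines of the sample CBENG display.
    samples = {
        "X": (2109336, -2231.297, -8.600, 0.000),
        "Y": (10054753, -9922.595, 3.500, 0.000),
    }
    for axis, args in samples.items():
        physical, user = abs_to_positions(*args)
        print(f"{axis} physical {physical:9.3f}  user {user:9.3f}")
```

For these inputs the results agree with the "physical" and "user" columns of the sample display to the three decimal places shown.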

stop

This is the current Galil stop code, returned by the SC command. It is read every loop in the control program. It provides some very useful information about the current state of the Galil controller. Some of the codes which occur on this system are:

        0    motors are running, independent mode
        1    motors stopped at commanded independent position
        2    decelerating or stopped by FWD limit switches
        3    decelerating or stopped by REV limit switches
        4    decelerating or stopped by Stop Command (ST)
        6    stopped by Abort input
        7    stopped by Abort command (AB)
        8    decelerating or stopped by Off-on-Error (OE1)
        9    stopped after Finding Edge (FE)
        10   stopped after Homing (HM)
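When logging or scripting around the display, it can be handy to turn the stop code into text. A minimal Python sketch follows, using only the codes listed above; the names are hypothetical, and codes not listed in this paper are reported as such instead of being guessed.

```python
# Stop codes exactly as listed in this paper (not the full Galil list).
STOP_CODES = {
    0: "motors are running, independent mode",
    1: "motors stopped at commanded independent position",
    2: "decelerating or stopped by FWD limit switches",
    3: "decelerating or stopped by REV limit switches",
    4: "decelerating or stopped by Stop Command (ST)",
    6: "stopped by Abort input",
    7: "stopped by Abort command (AB)",
    8: "decelerating or stopped by Off-on-Error (OE1)",
    9: "stopped after Finding Edge (FE)",
    10: "stopped after Homing (HM)",
}


def describe_stop_code(code: int) -> str:
    """Return the description for a stop code, or flag it as unlisted."""
    return STOP_CODES.get(code, f"unlisted stop code {code}")


if __name__ == "__main__":
    # All three axes in the sample display show stop code 1.
    print(describe_stop_code(1))
```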

velocity

This shows the current command velocity for the axes. If the axis is in slew mode, then it is the velocity in mm/sec at which the axis will move when commanded. If the axis is in velocity mode, then the displayed velocity is the actual velocity at which the axis is moving.

phys off

The physical offset is the distance in millimetres which must be added to the raw encoder reading to produce a position measurement equivalent to the readings of the old bonnette control system. This system is termed the physical coordinate system.

This is the number which must be adjusted if the encoder is replaced, or otherwise disturbed. The physical calibration references are included in the cass bonnette system documentation.

cal off

The calibration offset is the distance in millimetres which must be added to a position in the physical coordinate system to produce a position referenced to the present day bonnette. For the X and Y axes, 0,0 is defined as the position which puts a star at the center of rotation of the cass environment onto the TV camera at a position of 128,128 on the Leaky Memory. For the Z axis, the TV camera should be in focus when its position equals the distance from the bottom plane of the bonnette to the focal plane of the telescope below the bonnette.

user off

The user offset permits individual users to redefine the origin of the XY stage coordinate system. It is reset to zero every time the bonnette program is restarted. There is no user offset for the Z axis.

Galil I/O

This is an indication of the quality of the I/O between the computer and the Galil DMC-740 box. Each interchange between the two computers, and there are several per second, should produce a ':' character returned from the Galil. If nothing comes back from the Galil, or if a colon is not the first character, this value is decremented. If the value is less than zero, and a good read occurs, the value is incremented. However, it never goes above 0.

Thus, 0 represents good I/O, and increasingly negative values indicate I/O problems. If the number goes below -6, the control program returns error codes to the host computer when it attempts bonnette operations. The value is restrained so that it will not go below -18.
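The counter behaviour described above can be sketched in a few lines of Python. This is an illustration of the rules as written, not the control program's code, and all names are hypothetical. The text says errors are returned when the value goes "below -6", while the I/O stats description defines failure mode as "a -6", so the exact threshold is left as a parameter; the default used here is an assumption.

```python
IO_FLOOR = -18   # the value is restrained so it never goes below this
IO_CEILING = 0   # the value never goes above 0


def update_io_counter(counter: int, good_read: bool) -> int:
    """Apply one Galil interchange to the I/O quality counter.

    A good read (':' returned first) moves the counter up towards 0;
    a bad or missing reply moves it down towards the floor.
    """
    if good_read:
        return min(counter + 1, IO_CEILING)
    return max(counter - 1, IO_FLOOR)


def in_failure_mode(counter: int, threshold: int = -6) -> bool:
    """True once the counter has reached the threshold (assumed to be -6)."""
    return counter <= threshold


if __name__ == "__main__":
    counter = 0
    for _ in range(6):               # six consecutive read errors
        counter = update_io_counter(counter, good_read=False)
    print(counter, in_failure_mode(counter))
```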

Galil err

This is a rather static field, which holds the error code returned from the Galil after the last command was attempted. Normally it should be 0, indicating that all is well. Some of the error codes returned by the Galil are:

        1    unrecognized command
        3    command not valid in program
        4    operand error
        6    number out of range
        7    command not valid while running
        8    command not valid while not running
        9    variable error
        14   EEPROM check sum error
        16   IP incorrect sign during position move or IP given during forced deceleration
        17   command not valid while program running
        20   Begin not valid with motor off
        21   Begin not valid while running
        22   Begin not valid because of Limit Switch
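If a CBENG screen is captured to a file, the two Galil fields can be picked out of the display line and the error code translated. The Python sketch below assumes the line layout of the sample display (which, as noted earlier, may change); the regular expression and all names are hypothetical.

```python
import re

# Error codes exactly as listed in this paper (not the full Galil list).
GALIL_ERRORS = {
    1: "unrecognized command",
    3: "command not valid in program",
    4: "operand error",
    6: "number out of range",
    7: "command not valid while running",
    8: "command not valid while not running",
    9: "variable error",
    14: "EEPROM check sum error",
    16: "IP incorrect sign during position move or IP given during forced deceleration",
    17: "command not valid while program running",
    20: "Begin not valid with motor off",
    21: "Begin not valid while running",
    22: "Begin not valid because of Limit Switch",
}

# Matches e.g. "Galil I/O:     0, err:    0" (layout assumed from the sample).
STATUS_RE = re.compile(r"Galil I/O:\s*(-?\d+),\s*err:\s*(\d+)")


def parse_galil_status(line: str):
    """Return (io_counter, err_code, meaning), or None if the fields are absent."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    io_counter = int(match.group(1))
    err = int(match.group(2))
    if err == 0:
        meaning = "no error"
    else:
        meaning = GALIL_ERRORS.get(err, "code not listed in this paper")
    return io_counter, err, meaning


if __name__ == "__main__":
    sample = " X -2231.297   -8.600    0.000     Galil I/O:     0, err:    0"
    print(parse_galil_status(sample))
```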

I/O stats

This section has three values which give an idea of the overall I/O performance of the Galil to TCS III communications. The first number is the total number of I/O messages sent to the Galil. It should increment at about 6 per second. The second number is the number of individual read failures which have occurred. The third number records the number of times the system has been in failure mode, which is defined as a -6 in the I/O counter, or 6 consecutive read errors.

Basically the first number should count and the other two should be zeros. However, in practice there are usually one or two errors showing in the individual read fail value, while the system fail counter stays at zero. Typically the communications work flawlessly, and no further errors occur during operation. If the Galil controller fails, the failure counters start up, and keep incrementing until the controller is reset.

While these values were put in to monitor suspected communications problems, which in fact didn't exist, they are being left in to actually monitor communications quality. If anything odd is noted in the second and third values, it should be reported, and the cause investigated.

filter

This is the standard filter number, from 1 to 7. It may show a 99 when the filter is moving, or is stopped between positions.

ND

This is the standard neutral density filter number, from 1 to 4. It may show a 99 when the filter is moving, or is stopped between positions.

CM Rot

This is the rotation position of the central mirror, and may be from 1 to 7. It may also show a 99 when the mirror is moving.

TCS mirror pos

This field shows the number which would be returned by the TCS command MIRROR. It is a combination of CM Rot and CM Trans. It works as follows:

        CM ROT    CM TRANS    TCS mirror pos    NOTE
        n/c       1           0                 park
        n/c       2           -1                exchange
        1-7       3           1-7               mirror in center
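The table can be expressed as a small function. The Python sketch below is an illustration only and the function name is hypothetical. The paper does not say what MIRROR returns while either stage reads 99 or for any other combination, so those cases return None here as an explicit assumption.

```python
def tcs_mirror_pos(cm_rot: int, cm_trans: int):
    """Combine CM Rot and CM Trans into the TCS mirror position.

    Returns 0 for park, -1 for exchange, 1-7 when the mirror is in the
    center, and None for cases the table does not cover (assumption).
    """
    if cm_trans == 1:
        return 0            # park, rotation does not matter
    if cm_trans == 2:
        return -1           # exchange, rotation does not matter
    if cm_trans == 3 and 1 <= cm_rot <= 7:
        return cm_rot       # mirror in center: report the rotation position
    return None


if __name__ == "__main__":
    # The sample display shows CM Rot 6, CM Trans 1, TCS mirror pos 0.
    print(tcs_mirror_pos(6, 1))
```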

CM Trans

This is the position of the central mirror translation stage, and may be from 1 to 3, or read a 99 when the stage is moving between valid positions. The positions are:

        1    park
        2    exchange
        3    center of the field

Limits:

This field displays the limit status of the central mirror stage. It will be blank if the stage is not on a limit, or may display "park (+)" or "center (-)" if the stage is on a limit.

Probe

This field indicates which TV pickoff mirror is mounted on the front end of the XY stage. The valid mirror numbers are:

        0    no mirror is mounted
        1    small visible mirror
        2    large visible mirror (seldom used)
        3    IR mirror (never used)
        4    extended path IR mirror

It may also show a 99 if there is some error reading the mirror bits, or if an illegal combination of bits is set.

Plate

This is the protection plate which provides an additional layer of hardware protection to prevent collisions of the bonnette parts. Its numbers are the same as used for the mirror probe.

On plate

This is a binary value indicating whether the probe on plate signal (PPL or Xysafe) is true. When the proximity sensor is near the plate this will be a 1. Otherwise it is 0.

Y chain

This is a binary value indicating whether the Y stage chain tension signal (PYI) is true. When the chain is properly tensioned, and the tension arm is near the sensor, this will be a 1. Otherwise it is 0.

GO XY

This is a binary value indicating whether the SID card security signal permitting XY stage motion is true. If it is a 1, XY stage motion is permitted. A 0 indicates motion not permitted due to one or more causes.

GO Trans

This is a binary value indicating whether the SID card security signal permitting central mirror translation motion is true. If it is a 1, motion is permitted. A 0 indicates motion not permitted due to one or more causes.

override

This is a binary value indicating whether the manual override switch is depressed. If it is a 1, the switch is depressed.

cass bonnette control program has died

This is an optional message which can occur on the same line as the Z axis offsets. Its appearance indicates that the main bonnette control program was not running during the last three CBENG screen updates. The control program updates a heartbeat counter each time it runs. CBENG checks this heartbeat each time it updates the display. If it is identical for three consecutive times, the message is put up.
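The heartbeat check can be pictured with the following Python sketch. It is not taken from CBENG (which is FORTRAN 77); the class name is hypothetical, and "identical for three consecutive times" is interpreted here as three identical readings in a row, which is an assumption.

```python
class HeartbeatWatch:
    """Flag a stalled control program from repeated heartbeat readings."""

    def __init__(self, limit: int = 3):
        self.limit = limit      # identical readings needed to raise the flag
        self.last = None        # most recent heartbeat value seen
        self.same_count = 0     # how many times in a row it has been seen

    def update(self, heartbeat: int) -> bool:
        """Record one reading; return True if the program looks dead."""
        if heartbeat == self.last:
            self.same_count += 1
        else:
            self.last = heartbeat
            self.same_count = 1
        return self.same_count >= self.limit


if __name__ == "__main__":
    watch = HeartbeatWatch()
    for reading in (41, 42, 42, 42, 43):
        print(reading, watch.update(reading))
```

In this sketch the flag clears as soon as the counter changes again, which matches the idea that the message reflects only the most recent screen updates.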

Typically the message indicates that the main control program is either not running, or is stopped for some reason. This would be a good time to use the RTE "WH" command to determine the status of the bonnette programs.

It is also possible for the bonnette control program to get hung up by a prompt being left on the screen at the main TCS console. This is indicated on the WH display by CBCTL being suspended in state 2 on EQT 11 for the T.O. console, or EQT 15 or 16 for a remote terminal. The cure here is to just press ENTER on the console to get rid of the prompt.


Galil command pass through

CBENG has a feature which permits commands to be passed directly to the Galil DMC-740 for execution. This can be very useful in troubleshooting sessions as the results of the command can be seen in the CBENG display.

When CBENG is running and updating the display it is necessary to "break" the program to get its attention. How this is done varies with how CBENG was run in the first place. It always involves the RTE BR command, but the actual form will vary. Once CBENG receives the break request, it issues an "Enter Galil command or "EX" to exit:" prompt.

At this point anything entered other than "ex" or "EX" is sent directly to the Galil controller. Everything is converted to upper case, so it is OK to enter Galil commands in lower case. The program waits a short time for a response from the Galil, and then displays this exactly as received. It is possible to enter several Galil commands on one line, separated by the ";" character. Usually the results are correctly displayed. However, if the command execution takes too long, some of the final data may be missing. In this case shorten the command line next time.
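The handling of an entered line can be sketched as follows in Python. This only illustrates the rules stated above (exit on "ex" or "EX", everything else upper-cased, several commands separated by ";"); the function names are hypothetical, trimming of surrounding blanks is an assumption, and the example commands are made up.

```python
def prepare_galil_line(entered: str):
    """Return the upper-cased line to send, or None if the user asked to exit."""
    text = entered.strip()          # trimming blanks is an assumption
    if text in ("ex", "EX"):
        return None
    return text.upper()


def split_commands(line: str):
    """Split a pass-through line into its individual ';'-separated commands."""
    return [part.strip() for part in line.split(";") if part.strip()]


if __name__ == "__main__":
    line = prepare_galil_line("sc;tp")
    print(line, split_commands(line))
```

Counting the commands on a line this way is one simple means of keeping lines short when the tail of a response goes missing.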

Because CBENG must coexist with the normal bonnette control program, which is continually issuing information and control requests to the Galil controller, there are contention problems when performing pass through commands. This is handled by a handshake system which delays the next routine command from the control program so the pass through can occur without interference. This normally works well, but may on occasion permit a collision of commands from the two programs.

Due to the continual updating of the terminal screen by the CBENG program any other activity on the terminal can cause a real mess. Pressing the terminal's CLEAR DISPLAY key when the cursor is at the top of the screen clears everything below it, and CBENG then redraws the screen correctly. A little practice will get the timing right. It also helps to try timing command entry to occur when the cursor has paused at the bottom of the display between updates. Again, a little practice will help. No matter how messed up the display gets it should be possible to clear it up and get back to an operational mode.


using CBENG command in TCS session

This refers to the use of the CBENG command from the normal TCS command prompt. It operates a bit differently this way than when invoked from another session. When commanded this way it runs for only 15 repetitions, and then quits. This avoids having to kill the program explicitly. If it does become necessary to kill it, the command OF,CBENG should work.

The Galil pass-through may be used when CBENG is run this way. To get CBENG to ask for a Galil command, first get a TCS command prompt and issue the break command, BR,CBENG. This causes CBENG to issue its "Enter Galil command or "EX" to exit:" prompt.


running CBENG from system prompt

This refers to running the CBENG program by issuing a system run command from the normal TCS prompt. This is done with the command RU,CBENG, or optionally, RU,CBENG[,time[,repetitions]]. The first form runs CBENG at max rate forever, and the second form permits entering the delay time in seconds, and the number of repetitions to run. If it is necessary to kill the program, the command OF,CBENG should work.
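The nesting of the optional parameters means that repetitions can only be given when a time is also given. A small Python sketch of building the command string is shown below; the helper is hypothetical and exists only to illustrate the syntax.

```python
def ru_cbeng(time_s=None, repetitions=None) -> str:
    """Build an RU,CBENG[,time[,repetitions]] command string."""
    if repetitions is not None and time_s is None:
        # The syntax nests repetitions inside the optional time field.
        raise ValueError("repetitions requires a time value")
    parts = ["RU", "CBENG"]
    if time_s is not None:
        parts.append(str(time_s))
    if repetitions is not None:
        parts.append(str(repetitions))
    return ",".join(parts)


if __name__ == "__main__":
    print(ru_cbeng())         # runs at max rate until killed
    print(ru_cbeng(2, 10))    # 2 second delay, 10 repetitions
```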

If the RU command elicits a "NO SUCH PROG" message, then use the CBENG command once, as described above. This automatically installs the program in the OS.

The Galil pass-through mode operates the same as if CBENG were executed with the CBENG command.


CBENG command from private session

This refers to running the CBENG program from a terminal other than the one running the TCS or bonnette DILOG session. This is done with the command RU,CBENG, or optionally, RU,CBENG[,time[,repetitions]]. As with all RTE program execution, the "RU," is optional.

For CBENG to work, either the TCS or bonnette session must be running on another terminal. If it is not, CBENG will just report on what was left in memory at the end of the last TCS session. After three loops the "cass bonnette control program has died" message will be displayed if no cass bonnette control program is running.

To exit CBENG, get a system break mode prompt and enter OF. To use the pass through feature, get a system break mode prompt and enter BR.


Galil command pass through

The basic bonnette software also has a command to permit commands to be passed directly to the Galil DMC-740. The command is "CB", and it takes no parameters. It issues a prompt to "Enter Galil command:". This feature works in the same way as the pass through feature of CBENG.

Anything entered is sent directly to the Galil controller. Everything is converted to upper case, so it is OK to enter Galil commands in lower case. The program waits a short time for a response from the Galil, and then displays this exactly as received. It is possible to enter several Galil commands on one line, separated by the ";" character. Usually the results are correctly displayed. However, if the command execution takes too long, some of the final data may be missing. In this case shorten the command line next time.


Debug printouts

The cass bonnette control program has a lot of debugging printouts still installed. They can be triggered by an on-line command, DEBUG n, which must be issued at the TCS command prompt. For the most part increasing numbers result in increased printouts, and include everything that would be printed out by a lesser number. Using the command may slow down execution of the system, and will certainly clutter up the screen. There are a few things which come out that may be of use to a non-programmer type debugger.

The following printouts are currently set up.

        0    Turns off all debug messages
        8    Velocity mode data every 4 loops, including rate statistics
             Reports all command strings launched to Galil
        9    Additional velocity mode data every loop, including steps taken
             Additional info on command launches and status
        11   Max output every loop, which includes:
               many Commands structure values
               hex values of all SID card I/O buss addresses
               XYZ absolute positions
               Galil stop codes
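The cumulative nature of the DEBUG levels can be pictured with the Python sketch below. It is not the control program's logic; the names are hypothetical, and the treatment of numbers not listed above (for example 10 behaving like 9) is an assumption based on the "for the most part" rule that a larger number includes everything printed by a lesser one.

```python
# Printouts introduced at each listed DEBUG level, as described in the text.
DEBUG_LEVELS = {
    8: [
        "velocity mode data every 4 loops, including rate statistics",
        "all command strings launched to Galil",
    ],
    9: [
        "additional velocity mode data every loop, including steps taken",
        "additional info on command launches and status",
    ],
    11: [
        "many Commands structure values",
        "hex values of all SID card I/O buss addresses",
        "XYZ absolute positions",
        "Galil stop codes",
    ],
}


def active_printouts(n: int):
    """List the printouts expected for DEBUG n, lower levels included."""
    active = []
    for level in sorted(DEBUG_LEVELS):
        if n >= level:
            active.extend(DEBUG_LEVELS[level])
    return active


if __name__ == "__main__":
    for item in active_printouts(9):
        print(item)
```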


Galil reset

The TCS control program has a dedicated command, @RESET, to issue a reset command to the Galil controller. Of course the use of this command requires that the communications to the controller be working.