Studying RSX11M source code I found out that the original RP11
controller (vs. -C and -E later versions) was still supported
in the software, and notably UMD (User Mode Diagnostics) was
still capable of handing it and dealing with its testing,
even in much later OS versions (compared to when the original
RP11 was phased out).
The change in this commit basically implements that flavor
(which only supported the RP02 drives), and also makes a few
minor fixes / cleanups for the device code, here and there.
The default mode for the RP11 controller remains the more widely
used -C/-E version, but if so desired, it can be downgraded to
support its predecessor with the "SET RR <type>" command now.
Recent analysis of the 2.9BSD kernel revealed that RP11 was
expected to interrupt on control RESET function if IE bit was
also set. Documentation was not very clear of the fact, saying
in one place that RESET+GO does not interrupt (which is not
contradictory with the above because it does not mention IE).
In other place, however, it says that IE always causes interrupt
when DONE is asserted. Thus, since RESET does assert DONE, an
interrupt should be posted if IE is set. The autoconfig binary
from 2.9BSD uses this feature of RP11 to check the presence
of the controller.
Formerly RESET was always clearing RPCS with DONE unconditionally,
and that reset IE as well. This patch makes sure that the IE bit
is preserved, and if set, it posts an interrupt when RESET asserts
DONE.
It looks like disk controllers, which automatically update
disk address (DA) after completion of I/O, are expected to do
so even if there was no data transfer because of I/O errors.
I was studying RSX-11's Error Logger documentation and
examples are clearly offsetting disk addresses backwards
by one when I/O errors are reported by the controller.
Since once the controller has found the DA-specified sector,
the I/O begins regardless of the condition of the sector (bad
or good data) or ability to transfer the contents between the
disk and the memory. If an error occurs (NXM, for instance)
the operation would stop (with the error reported) at the end
of the sector. So if, for example, the bus address register
had a bad address from the get-go, no data would be able to
transfer at all, yet DA should still be updated with DA + 1
once the controller asserts the DONE bit.
This patch makes sure that DA is always advanced when I/O has
actually been commenced.
Running earlier XXDP tests revealed that a technique of concurrent command
initiation and continued housekeeping for the command completion was used in
the old code.
For example, code could initiate a SEEK command for a drive, and knowing that
CS_DONE (and thus, an interrupt) is coming in about 16us, it would then go
ahead and clear a flag, which registers that the interrupt has occurred
(expected to be set to 1 by the ISR). If CS_DONE is set by the implementation
at the function initiation immediately, that would mean that the interrupt
could be triggered before the next instruction, and the flag would be set by
the ISR right away. The main code, however, would proceed with the the flag
clear as the following instruction, thus, never detecting the interrupt down
the road.
Since this technique was in existence, it is better to introduce a delay for
setting CS_DONE in the "fast" initiation commands like SEEK and HOME, to
accommodate the software that was relying on it.
So far, however, no issues were encountered in testing (except one), where
this delay mattered, but it's hard to tell if it would not be needed at all.
All I/O commands always delay CS_DONE already because they were never supposed
to be immediate.
Since the time for CS_DONE in initiation commands was documented at 16us, the
introduced delay is set to 10 instructions, which usually took more than that
to execute. But the interrupt flag clear case would be covered, as well as
the counted waits, which used some 25+-iteration tight loops for "drive ready",
before flagging a time-out (so the delay cannot be longer, either).
It also looks like more modern code never used any such tricks, so for it, it
should not matter if CS_DONE was slightly delayed or not.
Having run the device code thru XXDP and some other OS's and scenarios
rigorously, a bunch of discrepancies were found, which need to be addressed
by this rather extensive patch.
1. Each unit must implement its own "drive status" register, to be able to
track per-drive errors / conditions correctly;
2. Fixed INT_SET() / INT_CLR() in RPCS write function (wrong order of the "if"
conditions);
3. Some behavior was implemented not exactly how it was expected from the real
hardware, such as:
a. Post-I/O register values in RPDA and RPCA (including the corner case of
pack overflow);
b. I/O stacking, which wasn't mentioned in any available documentation, but
only XXDP listings;
c. RESET/IDLE function must be accepted for a "busy" controller;
d. HOME function must always execute, even when "device ready" is not set
(e.g. when SEEK error detected);
e. SEEK incomplete should not respond with "device ready" (however, the
condition can be cleared by HOME, d.);
f. WLOA-induced write-lock violation wasn't reflected in "device status".
4. Some timing was off so that the device worked "too fast" -- this was fixed
(except for the pathological cases when the races are in the actual test
code, and cannot be logically fixed);
5. WLOA setup command bug was fixed;
6. Added more code comments found per the above peculiarities.