The venerable Voyager 1 spacecraft is experiencing another glitch. Instead of sending science and engg. data, it is sending a 0101 bit pattern.
The problem has been narrowed down to the flight data system (FDS), which is not communicating properly with the telecom unit (TMU). A reboot did not help.
Stay tuned as NASA engrs work out a fix for this 1970s-era computer, which has performed magnificently during its long 46-year journey to the planets and to outer space.
https://blogs.nasa.gov/sunspot/
1/n
The two Voyager spacecraft, Voyager 1 and Voyager 2, launched on Sept 5, 1977 and Aug 20, 1977 respectively, have been traveling in space for over 46 years.
Voyager 1 is farther away from earth at 24.3 bil km (22.5 light hours), while V2 is 20.3 bil km away, located below the ecliptic. Both spacecraft are in interstellar space.
Here are the locations and some vital stats on the two Voyager spacecraft.
You can follow the real-time status of Voyager at https://voyager.jpl.nasa.gov/mission/status/
Graphic source: https://www.nasa.gov/solar-system/nasas-new-horizons-reaches-a-rare-space-milestone/
#Voyager
2/n
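A quick sanity check on the light-time figures quoted in this post: one-way light time is just distance divided by the speed of light. The sketch below uses the distances from the post; the function and constant names are made up for illustration.

```python
# Speed of light in km/s (exact by definition of the metre).
C_KM_PER_S = 299_792.458

def one_way_light_hours(distance_km: float) -> float:
    """One-way signal travel time in hours for a given distance in km."""
    return distance_km / C_KM_PER_S / 3600.0

# Distances as quoted in the post (approximate, and growing every day).
V1_KM = 24.3e9
V2_KM = 20.3e9

v1_hours = one_way_light_hours(V1_KM)   # about 22.5 hours
v2_hours = one_way_light_hours(V2_KM)   # about 18.8 hours
v1_round_trip = 2 * v1_hours            # about 45 hours

print(f"Voyager 1: {v1_hours:.1f} h one-way, {v1_round_trip:.0f} h round trip")
print(f"Voyager 2: {v2_hours:.1f} h one-way")
```

This is where the "45 hours" round-trip figure used throughout the thread comes from.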
Voyager's computer systems were custom-built using 1960s technology, with clock speeds measured in kHz and RAM in kbytes, running hand-crafted software crammed into 4K of 18-bit wide plated-wire memory (similar to but better than core mem).
And yes, it uses digital 8-track tape for storage.
The custom-designed hardware, (upgraded) software and instruments are mostly still functioning after 46 years in space!
https://history.nasa.gov/computers/Ch6-2.html
https://hackaday.com/2018/11/29/interstellar-8-track-the-low-tech-data-recorders-of-voyager/
@NSFVoyager2
#Voyager
3/n
This schematic of the Voyager telecom system shows that the FDS sends data to the comm system over 2 serial interfaces - a low rate 10 b/s interface routed to the S-band transmitter and a variable rate 10 - 115.2 kb/s interface whose bits are sent via X or S band.
Also, from the 2 diagrams (this post and post #1), the outer coding (Reed-Solomon) is done in software!
What do you think might cause the data to be stuck not at 0 or 1 but at 0101?
https://descanso.jpl.nasa.gov/DPSummary/Descanso4--Voyager_new.pdf
@destevez
#Voyager
4/n
For those interested in failures and recovery in far away spacecraft, check out this thread from August, when Voyager 2 lost contact with earth due to a mispointed antenna (caused by operator error).
https://fosstodon.org/@AkaSci/110831401826701180
#Voyager
5/n
Richard Stephenson of DSN Canberra explains on twitter how NASA verified that the uplink is working.
They sent a command to Voyager 1 to switch between non-coherent mode and coherent mode transmission.
In coherent mode, the transmission clock is derived from the Rx signal instead of from the AUX oscillator. This changes the Tx RF frequency a bit, which was detected at the DSN.
https://science.nasa.gov/learn/basics-of-space-flight/chapter10-1/
https://descanso.jpl.nasa.gov/DPSummary/Descanso4--Voyager_new.pdf
#Voyager
6/n
In the blog post at https://blogs.nasa.gov/sunspot/, Voyager engineers point out the difficulty in diagnosing problems and crafting solutions for a spacecraft with a signal round-trip-time of almost 2 days and hardware/software developed over 46 years ago using technology long since obsolete.
"Finding solutions to challenges the probes encounter often entails consulting original, decades-old documents written by engineers who didn’t anticipate the issues that are arising today."
#Voyager
7/n
NASA DSN in Goldstone, CA is currently receiving the downlink from Voyager 1 at a reduced rate of 40 bps. No uplink at this moment.
Apparently, Voyager 1 switched data rate (160 -> 40 bps) & did a full memory read-out of her Attitude and Articulation Control Subsystem, Flight Data Subsystem, and Command Computer Subsystems A&B.
Transmission time = 6 hours
Download size = ~108 kBytes
Here's hoping that the received data is not 0101...
8/n
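The ~108 kByte figure is consistent with the quoted rate and duration. A minimal check, assuming the whole 6 hours ran at 40 bps and ignoring any framing or coding overhead (the function name is just for illustration):

```python
def bytes_transferred(rate_bps: int, hours: int) -> int:
    """Raw bytes sent at a constant bit rate over a whole number of hours."""
    total_bits = rate_bps * hours * 3600
    return total_bits // 8

readout_bytes = bytes_transferred(40, 6)
print(readout_bytes)            # 108000 bytes, i.e. ~108 kBytes

# For comparison, how long the same volume would take at the normal 160 bps.
hours_at_160 = readout_bytes * 8 / 160 / 3600
print(hours_at_160)             # 1.5 hours
```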
A similar but not identical problem afflicted Voyager 2 in 2010. Received science data (but not engg data?) was garbled.
The problem was traced to a flipped bit in the program stored in the FDS. A command was sent to flip the bit.
The issue was diagnosed by downloading a full memory image, which implies that engg data download was working.
This is probably what was done today with Voyager 1. Hopefully, it is a similar problem.
https://voyager.jpl.nasa.gov/news/details.php?article_id=16
@destevez
#Voyager
9/n
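To make the "flipped bit" idea concrete: given a memory dump and a known-good reference image, finding the bad bit is an XOR comparison, and fixing it is another XOR. This is only a sketch of the concept. The memory contents and addresses below are invented, and it says nothing about how the actual FDS commands work.

```python
def find_flipped_bits(dump, reference):
    """Return (address, bit) pairs where the dump differs from the reference."""
    diffs = []
    for addr, (got, want) in enumerate(zip(dump, reference)):
        delta = got ^ want          # 1-bits mark the positions that differ
        bit = 0
        while delta:
            if delta & 1:
                diffs.append((addr, bit))
            delta >>= 1
            bit += 1
    return diffs

def flip_bit(memory, address, bit):
    """Toggle a single bit of one memory word in place."""
    memory[address] ^= (1 << bit)

# Hypothetical 3-word "program" and a dump with one corrupted bit.
reference = [0x1234, 0x00FF, 0xABCD]
dump = list(reference)
dump[1] ^= 1 << 9                   # simulate a single-bit upset

bad = find_flipped_bits(dump, reference)
print(bad)                          # [(1, 9)]

for addr, bit in bad:
    flip_bit(dump, addr, bit)       # "send the command" to flip it back
print(dump == reference)            # True
```

Note that this only works if the memory can be read back at all, which is what the full memory read-out provides.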
NASA did not provide a date but it looks like this issue was discovered and acted upon on Dec 7 or 8.
The graphic below shows the schedule for Voyager 1 comms via DSN, generated on Dec 7. Normally, the downlink rate is 160 bps. On Dec 8, it was switched to 40 bps. And again on Dec 10. Some special commands for the FDS were also sent.
Since then, the D/L rate has been switched between 160 bps and 40 bps a few times with additional FDS commands uploaded.
https://voyager.jpl.nasa.gov/pdf/sfos2023pdf/23_12_07-23_12_25.sfos.pdf
#Voyager
10/n
Two-way comms happening now between Voyager 1 and NASA DSN Canberra.
Of course, the results of the uplink commands will arrive 45 hours from now. The data arriving now left Voyager 1 22.5 hours ago.
Downlink rate is the lower 40 bps rate.
The DSN schedule for Voyager 1 shown below was modified and published yesterday.
Here's hoping that Voyager engineers are getting closer to a solution.
https://eyes.nasa.gov/dsn/dsn.html
https://voyager.jpl.nasa.gov/pdf/sfos2023pdf/23_12_14-24_01_01.sfos.pdf
#Voyager
11/n
NASA JPL provided a minor update today about the status of the Voyager 1 spacecraft, indicating that the comm. problem that started more than 2 months ago has not been resolved yet. No other details.
Please check out the rest of this thread for more info on the problem where instead of sending science and engg. data, Voyager 1 has been stuck sending a 0101 bit pattern.
@NSFVoyager2
#Voyager
12/n
No new info on the status of the Voyager 1 spacecraft, which since Nov 2023 has been sending a 0101 bit pattern instead of real data.
Several popular science outfits have been covering it lately. A bit flip in the FDS is suspected, but it is difficult to identify since the memory cannot be read back.
Several commands were sent yesterday to Voyager 1; responses will arrive 45 hours later, i.e. tomorrow.
Wonder why they cannot overwrite all prog and data memory.
https://arstechnica.com/space/2024/02/humanitys-most-distant-space-probe-jeopardized-by-computer-glitch/
#Space
13/n
Good news from the Voyager 1 spacecraft that has been stuck sending a 0101 pattern since Nov 2023.
The team has long suspected the root cause to be a corrupted area of memory in the FDS computer. On Mar 1, they sent some commands to make the FDS skip around sections of memory. The data stream rcvd 45 hours later looked different and was decoded to contain a read-out of the entire FDS memory!
Hopefully, they can now identify and fix the offending memory words.
https://blogs.nasa.gov/sunspot/
14/n
Voyager is not out of the woods yet, but the lesson for all of us is to never ever give up.
Here is the schedule for comms with Voyager 1 via NASA DSN this weekend. Some new commands will be sent on Friday, with responses expected 45 hours later on Sunday.
https://voyager.jpl.nasa.gov/pdf/sfos2024pdf/24_03_14-24_04_01.sfos.pdf
15/n
Some tech. info on the Voyager FDS computer –
- There was a backup FDS unit but it failed in 1981.
- Custom CMOS CPU - 36 instructions. 80 KIPS, 115 kbps data rate.
- 128 registers, kept in memory.
- CMOS memory, a first in space, 8KB.
- No separate memory for program storage vs execution. The CMOS memory is volatile but is kept continuously powered by the RTG.
- DMA access to memory by hardware. Instead of “cycle-stealing”, the instructions indicated cycles where DMA can occur.
https://ntrs.nasa.gov/api/citations/19880069935/downloads/19880069935_Optimized.pdf
16/n
Status update on the Voyager 1 spacecraft which has been sending a 0101 pattern since Nov 2023.
The problem seems to be a failed memory part in the FDS computer; engineers are planning to move ~200 words of software from one region to another, according to Joseph Westlake, director of NASA’s heliophysics division, who was speaking at a March 20 meeting of the National Academies’ Committee on Solar and Space Physics.
Westlake sounded very optimistic.
https://www.nationalacademies.org/documents/embed/link/LF2255DA3DD1C41C0A42D3BEF0989ACAECE3053A6A9B/file/D727AF88E8C806D7A1F75C8401AF9CF23BCCC2EC9F3A?noSaveAs=1
17/n
It looks like the Voyager team is preparing for a new "memory upload" to the FDS computer on Friday, as evident from the DSN schedule and instructions shown below for Voyager 1.
I am guessing that this is to rearrange the software so that it no longer uses the locations in the faulty memory chip in the FDS. If true, then hopefully we will hear Voyager 1's true voice on Sunday, 45 hours later. OTOH, this may be just one of many steps on the road to recovery.
https://voyager.jpl.nasa.gov/pdf/sfos2024pdf/24_03_28-24_04_15.sfos.pdf
18/n
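A toy illustration of what "rearranging the software to avoid faulty memory" could look like: mark the bad addresses, find the free runs that are left, and place the displaced code into them. Everything here (memory size, addresses, chunk names, the first-fit strategy) is an assumption for illustration, not the Voyager team's actual procedure. A real patch also has to fix up every jump and reference that points into the moved code, which this sketch does not model.

```python
def free_spans(memory_size, used, bad):
    """Return (start, length) runs of addresses that are neither used nor bad."""
    spans = []
    start = None
    for addr in range(memory_size):
        ok = addr not in used and addr not in bad
        if ok and start is None:
            start = addr
        elif not ok and start is not None:
            spans.append((start, addr - start))
            start = None
    if start is not None:
        spans.append((start, memory_size - start))
    return spans

def place_chunks(chunk_sizes, spans):
    """First-fit placement: give each chunk a start address in a free span."""
    remaining = [list(s) for s in spans]
    placement = {}
    for name, size in chunk_sizes.items():
        for span in remaining:
            if span[1] >= size:
                placement[name] = span[0]
                span[0] += size     # shrink the span from the front
                span[1] -= size
                break
        else:
            raise ValueError(f"no free span large enough for {name}")
    return placement

# Hypothetical 32-word memory: words 8..11 sit in the failed part.
used = set(range(0, 8)) | set(range(20, 24))
bad = set(range(8, 12))
spans = free_spans(32, used, bad)
print(spans)                        # [(12, 8), (24, 8)]

# Code that used to live in the bad region, split into three chunks.
print(place_chunks({"a": 5, "b": 5, "c": 3}, spans))
# {'a': 12, 'b': 24, 'c': 17}
```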
Looks like the "memory upload" to the Flight Data Subsystem (FDS) on Voyager 1 is taking place at this time from the NASA DSN site in Canberra.
Go Voyager!
https://eyes.nasa.gov/dsn/dsn.html
https://en.wikipedia.org/wiki/Canberra_Deep_Space_Communication_Complex
19/n
It's been 6 hours since the "memory upload" data was transmitted to Voyager 1 from the NASA DSN site in Canberra.
During that time, the signal has traveled about a quarter of the way to Voyager 1, about the average distance to Pluto. The response will arrive at earth on Sunday around 1500 UTC (RTT = 45 hours).
Let's imagine a spacecraft sent to the nearest star Proxima Centauri, 4.2 light-years away. How would we diagnose problems and upload new software to it?
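The arithmetic behind the numbers above, as a small sketch. It uses the 22.5 light-hour one-way figure from earlier in the thread; the function names are made up for illustration.

```python
ONE_WAY_HOURS_V1 = 22.5   # one-way light time to Voyager 1, from this thread

def fraction_of_trip(elapsed_hours: float,
                     one_way_hours: float = ONE_WAY_HOURS_V1) -> float:
    """How far along the one-way path a signal is after elapsed_hours."""
    return elapsed_hours / one_way_hours

def round_trip_years(distance_light_years: float) -> float:
    """Command-to-response delay in years for a target at that distance."""
    return 2 * distance_light_years

print(f"{fraction_of_trip(6):.2f}")     # 0.27, roughly a quarter of the way
print(round_trip_years(4.2))            # 8.4 years for Proxima Centauri
```

So a single "send command, see result" debugging cycle that takes 45 hours for Voyager 1 would take over 8 years for a probe at Proxima Centauri.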
@AkaSci It’s coming back one day as V-ger.
@AkaSci Overwriting all prog memory is a good way to lose the spacecraft for good. A few bits wrong...
@TMEubanks
Systems I have been involved with always have multiple memory banks including one with a golden image, which cannot be overwritten and which can be booted if the current image(s) get corrupted. The golden image has reduced functionality but it supports telecom and image uploads.
Voyager probably did not have that luxury with its limited memory size.
@AkaSci I don't think that was done with 1970's spacecraft.
Note that cosmic rays can flip or hard set any bits anywhere, so what if the "golden image" gets corrupted?
@TMEubanks
Often, the golden image is in a memory type less susceptible to cosmic rays.
Plus, we have error correction and scrubbing.
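A minimal sketch of the golden-image idea described above: try the updatable images first, and fall back to the protected one if their checksums do not match. The `Image` class, the field names and the use of CRC-32 are assumptions for illustration, not a description of any particular flight system.

```python
import zlib
from dataclasses import dataclass

@dataclass
class Image:
    name: str
    data: bytes
    expected_crc: int   # checksum recorded when the image was written

    def is_valid(self) -> bool:
        """True if the stored data still matches its recorded CRC-32."""
        return zlib.crc32(self.data) == self.expected_crc

def select_boot_image(candidates, golden):
    """Boot the first valid updatable image; otherwise use the golden one.

    This sketch trusts the golden image blindly. A real system would
    protect it separately (hardened memory, ECC, scrubbing).
    """
    for image in candidates:
        if image.is_valid():
            return image
    return golden

good_crc = zlib.crc32(b"flight software v2")
corrupted = Image("bank A", b"flight softwarf v2", good_crc)  # one bad byte
intact = Image("bank B", b"flight software v2", good_crc)
golden = Image("golden", b"minimal telecom", zlib.crc32(b"minimal telecom"))

print(select_boot_image([corrupted, intact], golden).name)   # bank B
print(select_boot_image([corrupted], golden).name)           # golden
```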
@AkaSci simply amazing! Hope they can get it past the issues and continue on
@AkaSci holy shit, great to hear!!!!
@AkaSci this must be the furthest memory leak achieved to date, and maybe the slowest too.
@AkaSci ah the joy of debugging
@abesamma
Yes, debugging a corrupted program image where the corruption has disabled the ability to read back memory. 22.5 light-hours away. It takes the Right Stuff to debug it.
@AkaSci if there’s any piece of hardware desperately in need of a perfect simulator, it’s Voyager 1.
@camstonefaux @AkaSci there’s no hardware copy. As I understand it the voyagers predate that practice. I don’t know about a perfect virtual simulator, but it’s certainly possible.
@camstonefaux @AkaSci got a link?
@AkaSci @4raylee
Sort of… can’t say I’m convinced it’s complete, and sadly the website where it was hosted appears to have been lost, possibly due to the death of the author (?)
https://www.reddit.com/r/space/s/OB3NAI9q7M
@camstonefaux @AkaSci @4raylee nope, the Voyager probes predate this practice. I can't recall when they started doing it, but I want to say it wasn't until the era of the Spirit & Opportunity rovers that it was clear how valuable the practice is.
Not sure what the hardware that you linked further down is, but at most it could be a software simulation. Seems more likely that it's part of the infrastructure for mission control, though.
@SnoopJ @AkaSci @4raylee There were hints it was a h/w sim, but no references or other links- so I can’t say with authority there are. But if I was an engineer in the late 1970’s at NASA (I was in high school, and just started fooling with 6502’s)- I’d be sure the entire system was demonstrated & tested in a h/w brass board before committing to full on flight h/w.
@camstonefaux @AkaSci @4raylee sure, I'm not saying they didn't do ground-side testing of components.
But it isn't like today where if Curiosity or Perseverance run into a problem, they can roll out the 1:1 terrestrial copies (named MAGGIE and OPTIMISM, respectively) to do some debugging or try out some ideas.
@AkaSci @TomShafShafer This is mind blowing to me. Thank you for sharing.
Successfully patching a decades-old system that's 23 light-hours away and has no remote-hands capabilities for recovery is truly impressive, considering commercial OS vendors today can't seem to patch their bugs without causing 10% of their customers' machines to crash, brick, or boot-loop.
Thank you, Mr. W. Akshually.
@AkaSci this probably happened because, without the protection of the earth’s atmosphere, a Cosmic Ray flipped one or more bits in memory.
I sound like an expert because I am*. Here is the Cosmic Ray detection program I wrote. It is very high quality.
https://secretgeek.github.io/cosmic-ray-detector/index.html
(* intense sarcasm in this area)
Fingers crossed!
@AkaSci not compiling, but still... https://xkcd.com/303
@AkaSci What an incredible effort!
@AkaSci oh, good. everything being in RAM means there likely won't be a reduction in the probe's functionality, if they're able to bootstrap a fix. we were worried ROM was damaged and it couldn't execute from RAM or something like that.
@AkaSci sounds cool. Wish them good luck!
@AkaSci This is an amazing bit of troubleshooting (and by far the longest distance for tech support). Fingers crossed that this process works.
That would be highly symbolic, but I'll take it any time.
@AkaSci So... badblocks but for RAM? That's impressive.
@AkaSci wild to think of an entire code load, just flying through space. Transmission complete, it's just flying through space until a little antenna snags a bit of the energy.
@AkaSci If using a laser, I think it'll widen out so far and be so faint as to not be detectable without adding too much equipment for a small craft. It might make sense to create repeaters. It would still take 4.2 years to get there, but the craft would get a better signal regardless of the source, barring some breakthrough. With some AI in the repeater, it could initiate repairs. That would make common repairs possible without awaiting a signal from Earth. So, yes. With a little help.
@AkaSci with so many things going badly for humanity right now, it feels good to know we can still keep our most distant travelers working a little bit longer.