Regular system maintenance is scheduled for every Saturday morning, from 8am to noon America/Denver (GMT-07:00). Typically, there should be no outages. But if we replace hardware or change network settings, then the service may be temporarily offline.
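As a rough illustration of how the local times on this page map to GMT, here is a minimal Python sketch. It assumes fixed offsets for the two abbreviations used in the log (MST as GMT-07:00, MDT as GMT-06:00) and does not apply daylight-saving rules on its own. The helper names `to_gmt` and `maintenance_window_gmt` are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets for the abbreviations used in this log.
# No daylight-saving rules are applied automatically.
MST = timezone(timedelta(hours=-7), "MST")
MDT = timezone(timedelta(hours=-6), "MDT")


def to_gmt(year, month, day, hour, tz):
    """Return the GMT/UTC equivalent of a local wall-clock hour."""
    local = datetime(year, month, day, hour, tzinfo=tz)
    return local.astimezone(timezone.utc)


def maintenance_window_gmt(year, month, day, tz=MST):
    """Return (start, end) of the 8am-to-noon maintenance window in GMT."""
    return to_gmt(year, month, day, 8, tz), to_gmt(year, month, day, 12, tz)


# Saturday 2024-11-16, after the switch back to standard time.
start, end = maintenance_window_gmt(2024, 11, 16)
print(start.strftime("%H:%M"), end.strftime("%H:%M"))  # 15:00 19:00
```

While daylight time is in effect, passing `tz=MDT` shifts the same window one hour earlier in GMT. Python's standard `zoneinfo` module with the `America/Denver` zone would pick the right offset by date; fixed offsets are used here only to keep the sketch dependency-free.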
The last 22 maintenance updates:
Date
Planned
Outcome
2024-11-20
No planned outages
2024-11-16
System patching and maintenance.
No unexpected problems. Total downtime about 5 minutes.
2024-11-09
Server update
Updated FotoForensics server code for better integration with CloudFlare. No issues.
2024-10-28
Unplanned DDoS
At 1am MDT (07:00 GMT), a second DDoS wave hit. We have moved to a different subnet and are using CloudFlare to protect the server. We now believe this was a nation-state attack. Although network connectivity was degraded, the server was not compromised.
2024-10-26
Unplanned Outage
At 10am MDT (16:00 GMT), a misconfiguration at CloudFlare caused the site to be unreachable. Moreover, a bug in CloudFlare's configuration prevented me from correcting the problem. I ended up removing and then re-adding the domain to clear it. The problem has been reported to CloudFlare.
2024-10-23
Unplanned DDoS
At 3pm MDT (21:00 GMT), this service was hit by a massive DDoS. Although there was no system compromise, the service was significantly degraded for 24 hours. During this time, we moved the front-end of this service to CloudFlare for DDoS protection. By 2024-10-24 at 5pm MDT (23:00 GMT), all accessibility issues had been resolved.
2024-09-15
Updating UPS.
Update completed. UPS rebooted once, causing about 5 minutes of downtime.
2024-09-14
System patching and maintenance.
No unexpected problems. Total downtime about 5 minutes.
2024-08-17
System patching and maintenance.
No unexpected problems. Total downtime about 5 minutes.
2024-07-20
System patching and maintenance.
No unexpected problems. Total downtime about 5 minutes.
2024-06-15
System patching and maintenance. Unexpected UPS debugging.
UPS driver went crazy and decided to always reboot the UPS during reboot. This is the 2nd time I've had this problem. No longer monitoring the UPS using the Linux 'nut' system (not safe for production use). Going to create my own UPS monitor that will NEVER default to shutting down the UPS. Downtime: 20 minutes total, in 5 minute increments over 1.5 hours.
2024-05-18
Significant code update plus typical updates and patching.
Lots of minor code changes (that add up to a pretty big change) in preparation for switching compilers. (All code must compile cleanly.) Everything passed regression testing, so it's time to push it to production. There should be no downtime. Code update completed: zero downtime. Patches went painlessly; reboot downtime: 5 minutes.
2024-04-20
Kernel updates and patching.
Kernel patch installed! Previous stress test failed to trigger the crash. (Crossing fingers) This looks like it might resolve the problem. Downtime: about 6 minutes.
2024-04-09
Hard crash and reboot late at night.
I tracked down the bug! It's a kernel bug. Reported it and am looking for a workaround. Total downtime: about 1 hour, but worked on the server from 2am to 5am as I tried to debug the problem.
2024-03-24
Hard crash and reboot.
Yesterday's update generated problems last night. Required rebooting the server. Then the FotoForensics instance reported a drive problem and required a manual fsck. Everything seems happy now. Downtime: 7 minutes.
2024-03-23
Server patching and reboot.
Typical update; reboot took less than 30 seconds.
2024-01-20
UPS battery extender.
Test took longer than expected but was successful. (Had to buy a crimper, ring terminals, and shrink tubing to rig it up.) Good news: Battery extender has been installed. Previously I had about 54 minutes of emergency power. Now I have over 4 hours. (If the power is out longer than 4 hours, then the ISP will go down before me.) Total downtime for this installation: 20 minutes -- 10 minutes while removing the UPS and (a few hours later) 10 minutes while reinstalling the UPS.
2023-12-30
Installing a UPS battery extender.
Could not install. Device arrived damaged from Amazon. Good news: getting a refund. Bad news: no battery extender. I still have about 1 hour of battery.
2023-12-23
Server patching and reboot.
Debugging a possible kernel issue. Debug was performed on the backup server; no downtime on this server.
2023-12-16
Server patching and reboot.
This was a typical update. Zero downtime.
2023-11-18
Debug and correct the UPS monitoring.
Found the problem. The UPS monitoring software ('nut') decided that it was always supposed to shut down the UPS. Disabled by editing /etc/nut/upsmon.conf and removing the line "POWERDOWNFLAG /etc/killpower". Now every UPS is being monitored properly. (Downtime: about 4 minutes.)
2023-11-11
Server patching and reboot.
During the reboot, the UPS monitor went crazy and decided to automatically turn off the UPS. We were not able to debug the problem within the maintenance window. As a temporary workaround, we've moved the UPS monitoring to a secondary system.
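The 2023-11-18 entry describes removing the "POWERDOWNFLAG /etc/killpower" line from /etc/nut/upsmon.conf. The sketch below shows one way that edit could be expressed, assuming the configuration is handled as plain text. The function name `drop_powerdownflag` and the sample configuration are hypothetical. It works on a string rather than the real file, and it is not the procedure actually used on this server.

```python
def drop_powerdownflag(conf_text):
    """Return conf_text with any active POWERDOWNFLAG directive removed."""
    kept = []
    for line in conf_text.splitlines():
        # Match only active directives; commented-out lines start with '#'
        # and are left untouched.
        if line.strip().split()[:1] == ["POWERDOWNFLAG"]:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical sample configuration, for illustration only.
sample = "MINSUPPLIES 1\nPOWERDOWNFLAG /etc/killpower\nPOLLFREQ 5\n"
print(drop_powerdownflag(sample), end="")
```

Note that the 2024-06-15 entry reports the same symptom returning, so this edit alone should not be read as a complete fix.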