Senior Member
Lone Wolf Staff
Join Date: Dec 2016
Posts: 491
|
We are planning a quick hotfix deployment for Hero Lab Online on Thursday (July 30th) at 9 PM Pacific (4 AM UTC on July 31st). We expect downtime to be 5 minutes or less.
EDIT: This has been completed.
Last edited by SteveT; July 30th, 2020 at 08:07 PM. |
#1 |
Senior Member
Join Date: Jul 2015
Posts: 216
|
No notice, and right in the middle of Gencon sessions.
|
#2 |
Senior Member
Lone Wolf Staff
Join Date: May 2005
Posts: 8,232
|
Within the product itself, there should have appeared a notice regarding the planned outage. That notice should have appeared roughly TWO HOURS prior to the time and provided an ongoing update of when the outage would occur.
The actual outage lasted for only a few minutes. And given the two hours of advance notice within the product, it should have been practical for GMs to plan for a 5-minute "bio break" at 9pm. We are in the middle of GenCon, so there is simply no good time to deploy anything. However, there were issues that absolutely needed to be addressed. So we waited until later in the night as a "less bad" option. |
#3 |
Senior Member
Join Date: Jan 2013
Location: Rochester, MN
Posts: 1,519
|
As with PaizoCon Online, anyone can go look at Gen Con Online's list of events. Paizo (who is organizing the majority of Starfinder and Pathfinder 2nd Edition events) runs their Thursday/Friday/Saturday events starting at 8 AM, 2 PM, and 8 PM Eastern. Slots are 5 hours long, so shutting off the server at 9 PM Pacific (Midnight Eastern, 4 hours into the slot) meant you probably interrupted the final encounter of a bunch of events.
Sunday's events start at 9 AM Eastern, if there's an emergency. I played in our normal non-virtual game (with paper character sheets!) tonight so I don't know what warnings went out.
Last edited by Parody; July 30th, 2020 at 08:53 PM. |
#4 |
Senior Member
Lone Wolf Staff
Join Date: May 2005
Posts: 8,232
|
The information we had showed numerous games going late into the night. So there was simply no "good" time to do it. We consulted with the person on staff most familiar with the convention gaming schedule (the rest of us have been working round the clock), and she didn't flag a conflict with the outage timing. So we did our best to pick a "less bad" time, and it sounds like we could have been more thorough. I apologize for that.
We have literally been working around the clock to get everything into place for GenCon, and to address the rough edges that surfaced over the past couple of days from things we didn't catch during our own testing. There are limits to what a tiny team like ours can achieve, and I'm proud of what we've managed to put into place this week. It hasn't been perfect, but it's been pretty darn good. We're all exhausted on this end. I hope everybody has a great weekend gaming and that all of that hard work pays off overall.
Last edited by rob; July 30th, 2020 at 11:03 PM. |
#5 |
Senior Member
Lone Wolf Staff
Join Date: May 2005
Posts: 8,232
|
Addendum: If there are things we can do to improve the outage notification mechanism, please share your suggestions. We've striven to achieve a balance that accurately conveys upcoming outages without being obtrusive. If we need to adjust that balance, or if there's a use-case we haven't covered adequately, we can make the appropriate changes.
|
#6 |
Member
Join Date: Aug 2019
Posts: 35
|
Hey Rob,
Not knowing your infrastructure, is it possible that you could spin up a second front-end cluster (B), deploy the update to it, drain traffic from A to B, and then remove A? Obviously, if there are DB migrations, this might be less ideal and would require a lot more heavy lifting. |
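For anyone following along, the steps proposed above can be sketched roughly as below. This is only an illustration of the general blue/green idea, not how HLO's servers actually work: the `Cluster` class, the percentage weights, the 25% step size, and the `has_db_migration` flag are all made-up names and assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical front-end cluster; none of these fields come from HLO."""
    name: str
    version: str
    weight: int = 0           # percent of NEW sessions routed to this cluster
    active_sessions: int = 0  # sessions still attached to this cluster


def shift_traffic(old: Cluster, new: Cluster, step: int = 25):
    """Move routing weight from old to new in increments, yielding each state."""
    while old.weight > 0:
        moved = min(step, old.weight)
        old.weight -= moved
        new.weight += moved
        yield old.weight, new.weight


def can_retire(cluster: Cluster) -> bool:
    """A cluster is safe to remove once it gets no traffic and has fully drained."""
    return cluster.weight == 0 and cluster.active_sessions == 0


def blue_green_cutover(old: Cluster, new: Cluster,
                       has_db_migration: bool, step: int = 25):
    """Run the weight shift for a code-only release; refuse if the schema changes."""
    if has_db_migration:
        # Assumption: both versions share one database, so a schema change
        # would need extra coordination that this sketch does not attempt.
        raise ValueError("DB migration present: use a planned outage instead")
    if step <= 0:
        raise ValueError("step must be positive")
    return list(shift_traffic(old, new, step))


if __name__ == "__main__":
    a = Cluster("A", "current", weight=100, active_sessions=3)
    b = Cluster("B", "hotfix")
    print(blue_green_cutover(a, b, has_db_migration=False))
    print(can_retire(a))  # False while sessions are still attached to A
```

The point of the `can_retire` check is that A is only removed after its existing sessions finish, so nobody mid-encounter gets cut off. The `has_db_migration` guard reflects the caveat above: a schema change makes a side-by-side rollout much harder.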
#7 |
Senior Member
Join Date: Jul 2015
Posts: 216
|
Rob,
We had 4 in our group, all using HLO. If there was a warning, none of us caught it. Maybe make it a persistent toast that stays until we close it? I know I've seen it in the past, but I didn't seem to yesterday, when it was most critical. Also, yeah, the convention schedule, just like at PaizoCon, has been out for weeks and is very public. Please at least consider going outside of the main 5-hour blocks as @Parody suggests. And there are no release notes, so we don't even know what was fixed. |
#8 |
Senior Member
Lone Wolf Staff
Join Date: May 2005
Posts: 8,232
|
Quote:
This was something I wanted in place more than a year ago. Alas, I then found out the server code had to be completely rewritten (see my comments here for more info). During the rewrite process, I've probably put about 50% of the necessary infrastructure into place to accomplish this, but there's still a meaningful chunk of work left to do. And then a TON of testing. As you surmised, an additional factor has been that most releases (aside from these GenCon hotfixes) entail a bunch of database changes to incorporate the new capabilities we've been steadily adding. That increases the complexity greatly, and definitely wouldn't be supported at first, but we could still use the transition approach for hotfixes that are code changes only, like the ones we've needed the past few days. So it's definitely something I want to do, and have been working towards in pieces, but we're not there yet. My goal is to be there by the end of the year, finishing up the missing pieces interspersed with all the other new stuff that's in the queue. |
|
#9 |
Senior Member
Lone Wolf Staff
Join Date: May 2005
Posts: 8,232
|
Quote:
We're gonna figure out a better way to get the convention game schedule clearly known by the dev team in the future. The release notes went out this morning. We were wiped yesterday. The release notes were properly staged in advance, but we forgot to unveil them once the hotfix was officially deployed. |
|
#10 |