Rolling Restart - Complete
Thursday, July 5th, 2007 at 9:51 AM by: Joshua Linden
We’re initiating a rolling restart of the grid to pick up a fix for this issue:
SVC-385 Huge decrease in simulator performance after 2007-06-28 rolling restart
Thanks to Ruud Lathrop for reporting the issue. We isolated the issue on Tuesday, tested the fix yesterday, and are ready to roll out the update today. The restarts occur in a “wave” that sweeps north to south on the map. Each region will restart after a five-minute warning. It will take 2-3 hours for the restart to hit the whole grid.
[Update: 12:50pm - The rolling restart is complete. Thank you for your patience.]


July 5th, 2007 at 9:55 AM
YES!
I don’t exactly know what you mean by “rolling restart”…but it sounds good!
July 5th, 2007 at 10:10 AM
When will the Lindens listen to us about bugs? It’s like they don’t care, they’re asleep at the switch! Why, we’ve had a 50% reduction in sim performance, and… uh…
Oh, wait. Nevermind!
Thanks, guys.
July 5th, 2007 at 10:11 AM
Thanks for the swift response…….can we try to not let the simulators lose performance in the first place though, have you tried a defrag?.
July 5th, 2007 at 10:14 AM
THANKS FOR THE FIX…..BUT THIS SHOULD HAVE BEEN NOTICED BEFORE YOU RELEASED THE LAST UPDATE…….HOW ARE YOU GUYS TESTING YOUR UPDATES???? THIS IS REALLY BASIC STUFF. SEEMS NEW FEATURES CONTINUE TO BE A PRIORITY, RATHER THAN PERFORMANCE. PLEASE, PLEASE LISTEN TO THE RESIDENTS????????
July 5th, 2007 at 10:17 AM
LOL WE BEEN TELLING YOU THERE WAS A PROB AFTER THE RESTART ON 6/28
July 5th, 2007 at 10:19 AM
Ingie
July 5th, 2007 at 10:24 AM
Just a question. IF it’s not a security risk, could you guys possibly provide an explanation of what actually caused this problem?
I’m really curious as to what could cause such a drastic performance hit.
July 5th, 2007 at 10:36 AM
Whew. And I thought it was just me going insane…
July 5th, 2007 at 10:39 AM
Question: is this north-to-south by rows or columns?
I.E. is the restart order:
123
456
789
or
147
258
369
This doesn’t seem clear on the blog post.
July 5th, 2007 at 10:41 AM
Does this address the intermittent llEmail lag? You know, the lag that’s causing networked vendor systems like “JEVN” and such to fail?
https://jira.secondlife.com/browse/MISC-153
July 5th, 2007 at 10:58 AM
I got no warnings; I was just logged out by an admin while I was working on my land, and when I tried to log in again, I got a message that I have to wait until 11:23.
If this was a restart, why does it need around 30 minutes?
July 5th, 2007 at 11:02 AM
Is this going to also address the issue of land taking on the name and settings of a neighbor’s land?
We have not heard back from LL about our ticket.
July 5th, 2007 at 11:03 AM
I JUST CANNOT BELIEVE THAT NO-ONE AT LINDEN COULD SEE THIS BEFORE THEY RELEASED THE UPDATE? THIS IS REALLY FUNDAMENTAL STUFF. MAYBE THE PEOPLE WORKING ON THE “FIXES” DON’T TRY THE OVERALL SYSTEM? WHY DOESN’T LINDEN USE SOMETHING LIKE THE CONOVER SIM PERFORMANCE MONITOR, THEN THEY WOULD SPOT THIS SORT OF THING INSTANTLY? I REALLY AM DUMBFOUNDED THAT SUCH A HUGE PROBLEM CAN GET OUT ON A LIVE SYSTEM. TOTAL INCOMPETENCE AND LACK OF MANAGEMENT SUPERVISION. GOOD JOB THEY DON’T WORK FOR NASA.
July 5th, 2007 at 11:06 AM
using ALL-CAPS don’t make you smarter.
July 5th, 2007 at 11:08 AM
@ 10: YES BUT USING ALL CAPS MAKES IT MORE FUN!!!!1
July 5th, 2007 at 11:09 AM
You made folks work yesterday.. BAD BAD Lindens :(. Have pity on their poor souls and give them a 3 day weekend. Make them smile
July 5th, 2007 at 11:10 AM
Oh, it only took a week to fix this problem that should have been caught before the patch was rolled out. I read many inane comments in the JIRA about this problem saying, well, I didn’t see any slowdown. Boys and girls, use the stats tab; if you don’t know where it is, press Ctrl-Shift-1, then go to the bottom and click times. If you had been using this tool to check your land or sim you would have noted that your frame time had doubled, and that was because the script times had doubled. This problem was even more apparent in estate tools in the script analysis area. As a final solution, conover’s sim performance meter is a very handy tool for watching your land. This tool indicated a 50% performance drop. Every sim I visited showed this reduction in performance. Another line in that handy tool is the script performance; during the slowdown, script performance was also quite out of specifications.
Another minor point, Lindens: why do we have a bug report in the menu? You do not read it, it seems, as I filed a report a day there about this problem.
July 5th, 2007 at 11:10 AM
@ 10: YES BUT USING ALL CAPS MAKES IT MORE FUN!!!!11ONEone
**edited to provide more internet enjoyment!**
July 5th, 2007 at 11:11 AM
I assume the good CAPTAIN CAPS is one of the beta testers? Perhaps this stuff won’t show til there is a full load of users? There are lots of reasons why they might not be able to see things beforehand…
July 5th, 2007 at 11:13 AM
Wise Clapsaddle Says:
July 5th, 2007 at 10:11 AM PDT
Thanks for the swift response…….can we try to not let the simulators lose performance in the first place though, have you tried a defrag?.
—–
Shut up. Seriously. Defrag? Are you living inthe 90s? There is ABSOLUTELY NO REASON TO DEFRAG. None. Unless you are using the FAT16 or FAT32 filesystems. And if you are, you deserve what you get.
Non-techies. Stop spouting crap, backwards, nonsensical tidbits. You aren’t helping. You are making yourselves look bad. Just stop.
CAPTAIN NORALUNGA Says:
July 5th, 2007 at 10:14 AM PDT
THANKS FOR THE FIX…..BUT THIS SHOULD HAVE BEEN NOTICED BEFORE YOU RELEASED THE LAST UPDATE…….HOW ARE YOU GUYS TESTING YOUR UPDATES????
—–
Chances are, they can’t test it on a live grid with 30K concurrent users. It’s kind of hard when you don’t have 30K employees with nothing to do.
Someone with a clue reported and probably gave them relevant info to help them track down the problem. Screaming “IT IS SLOW FIX IT NOW I CAN’T HAVE E-SEX AS WELL AS BEFORE WAAAAHHH!!!” can’t get the problem fixed faster. They need info about what you or others are doing when you notice the slowdown. They need to know how many people are in the sim. They need to know what settings are on your client.
I really wish LL could require a basic level of computer literacy before they let the mouth breathers in….
July 5th, 2007 at 11:20 AM
@15
That is why there is a Beta Grid….. DUH!
@10
Please use correct grammar …… doesn’t NOT don’t
doesn’t = does not don’t =do not
AND I LIKE USING CAPS ……. MAKES IT EASIER TO READ WHEN YOU ARE SHORT-SIGHTED OR SHORT-MINDED.
July 5th, 2007 at 11:20 AM
#16 - exactly…. at least the lindens dont go to the mouth breathers place of work and tell them how to make the cinnabons.
July 5th, 2007 at 11:22 AM
@16
I just love you Linden ass-lickers…….get a second life…your first one is not working.
July 5th, 2007 at 11:23 AM
@16 Hadley Yoshikawa
Ditto.
Thank you Lindenonians!
Thought my ‘puter was wonking out on me.
Question (no I don’t really expect an answer here - more rhetorical) - NO FLY is set on my estate - but even though NO FLY was set - it turns itself back on after a couple hours - even though the NO FLY icon is still showing and all that, LOL.
Will this fix you have identified fix that, too? (here’s hoping)
Thanks for the hard work and keep your chins up. All the whiners and crybabies and boneheaded-dweebs like almost all the above posts always come out of the woodwork whenever you announce that you are -FIXING- something.
LOL
Go figure :\
July 5th, 2007 at 11:25 AM
While I didn’t feel this performance decrease, anything that helps improve performance is A-OK to me. I wonder though if dis explains some of the troubles with low or non-existent sales people are talkin’ about on the SL Forums.
July 5th, 2007 at 11:26 AM
CAPTAIN NORALUNGA Says:
July 5th, 2007 at 11:22 AM PDT
@16
I just love you Linden ass-lickers…….get a second life…your first one is not working.
—–
On the contrary, my first life is great. I’m just not an idiot and I understand the concept of load testing. If you can’t test with 30K users, you can’t know what problems may crop up. Screaming like a retarded howler monkey doesn’t do anything. Rationally providing relevant data does.
And I see you don’t understand the difference between programmer and Linden ass-licker. Not surprising, really.
July 5th, 2007 at 11:30 AM
Completely unrelated to anything with SL, but reading one guy’s comments slagging off another:
I work for a technical support company. Defrag is still valid for NFS as much as it was for 32 or 16. Files still get spread around on a hard disk regardless, and it can still affect performance. It won’t help much in this context, that much is true.
July 5th, 2007 at 11:32 AM
CAPS LOCK IS CRUISE CONTROL FOR COOL.
PROTIP: WITH CRUISE CONTROL, YOU STILL HAVE TO STEER.
Thanks for the fix, Lindens. Regardless, it should’ve been caught beforehand, but it’s good to see you recovering from your mistakes quicker.
July 5th, 2007 at 11:32 AM
I reported this and the support response was good, but they blamed my pretty little butterfly rocks and removed them all!
July 5th, 2007 at 11:33 AM
I just had a strange bug after the sim was restarted. I came back and found my new building (which I was working on just before the restart) not as I left it before the restart of that sim. Textures were back to 1×1 and some were even missing! Not sure yet what else is wrong (if there is more…). Lag is still way too high; it should be pretty low in this area.
July 5th, 2007 at 11:35 AM
Glad to hear that *a* server problem was found and fixed… however…
I *really* wish the Group Chat problems would get taken care of.
- Constant “Error messaging chat session…” pop-ups.
- Multiple sent lines show up out of order, *if* they show up at all.
- 5+ minutes for a message to get through to the group.
- 30+ minutes will go by, with no chat from anyone, and then several lines that are obviously part of a conversation taking place that is not coming through to everyone.
Relevant Jiras:
https://jira.secondlife.com/browse/VWR-1297
https://jira.secondlife.com/browse/VWR-1298
https://jira.secondlife.com/browse/VWR-1323
https://jira.secondlife.com/browse/VWR-513
Please get some attention on this!!!
July 5th, 2007 at 11:42 AM
Well out of misery comes a jewel, reading these comments I came across the ctl-shift-1 combo, and now i know that my computer really is marginal for running SL. Able to do a whopping 3 fps displaying the game. No wonder i never saw a slowdown, i was slow already. I guess I can file this tip with some of the other gems that I have gleaned. I’m glad somebody is able to run fast enough to notice the difference.
July 5th, 2007 at 11:45 AM
give the roll time to finish and the grid some settle time Youri. More than likely your sim was at the wrong end of a sim state save when it got reset, so your work relapsed a lil less than an hour. Always a good idea to grab a copy of something youre building before a reset.
July 5th, 2007 at 11:48 AM
FYI, the source of the issue was tracked down to an inadvertent change in the compiler optimization flags used to build the simulator executable. This change was reverted, and the simulator executable was rebuilt with no other code changes.
July 5th, 2007 at 11:52 AM
@ 26: Joshua Linden says: FYI, lowercase gibberish…
IF THE REST OF YOUR POST IS NOT IN ALL-CAPS THEN HOW ARE WE ALL GOING TO BE ABLE TO READ AND UNDERSTAND IT??//
July 5th, 2007 at 11:54 AM
Thanks Joshua ….. is the person responsible now seeking alternative employment?
July 5th, 2007 at 11:59 AM
CAPTAIN NORALUNGA, there are nicer ways to express dissatisfaction rather than name calling or being hostile. All you are doing is lowering our respect for your opinions by a great deal. Like to none.
The Beta grid consists of so few sims and even less load that it will not catch everything. People who actually spend time there already know this, so we’re more concerned with testing what we can, such as build tools.
Even the QnA team’s ‘bot stress test’ cannot catch everything. A ‘bot’ cannot use its own imagination to do anything, thus cannot fully test the flexibility of the code to adapt. I mean, a long time ago, if we rezzed and linked two toruses together, it’d crash the sim! Bots aren’t going to catch it.
As for why the coders don’t catch it? Have you ever spent hours working on a paper for a class, then read it repeatedly, correcting errors as you found them? When you turned it in and then got it back, the teacher had found even MORE errors? That’s what this is like for them.
Yeah, I’m such a LL sycophant, who hates their customer support system.. or lack thereof. But I give credit where credit’s due: Yay, poor coders and testers who worked on the holiday to fix this issue!!! You guys deserve an extra day off!
July 5th, 2007 at 12:00 PM
@16 Quote: “Shut up. Seriously. Defrag? Are you living inthe 90s? There is ABSOLUTELY NO REASON TO DEFRAG. None. Unless you are using the FAT16 or FAT32 filesystems. And if you are, you deserve what you get.”
You still have to defrag with NTFS. Maybe you should redirect all that bitterness into learning some facts :p
July 5th, 2007 at 12:02 PM
Great…. so now I cant put on any clothing….
1. I have a shirt that inventory says is off- but I still see it
2. I put on clothing and no one sees it, including me usually
I have tried to rebake over and over… no joy
What am I supposed to do? I can’t go in public like this
July 5th, 2007 at 12:03 PM
Captain Noralunga Said:
“doesn’t = does not don’t =do not
AND I LIKE USING CAPS ……. MAKES IT EASIER TO READ WHEN YOU ARE SHORT-SIGHTED OR SHORT-MINDED.”
No, it doesn’t.
Lindens, thanks for fixing it. I noted the slow-down and thought it was my own fault… maybe the internet connection slowed down or I talked too much… but now I am happy that you are to blame
July 5th, 2007 at 12:07 PM
So the guy issued the “make” command and didn’t check the compiler settings… nice…
This reminds me of the popular Gnome VS Kde battle on Linux, but this particular article springs to mind. Have a read here… http://www.illusionary.com/GNOMEvKDE.html
Guess which environment I’m reminded of :-p lmao (just to be on the safe side… it’s not kde)
July 5th, 2007 at 12:08 PM
@28 CAPTAIN NORALUNGA…probably not. People make mistakes, and are usually forgiven…What if you worked for LL as a developer. If you make a simple mistake like that, how would you feel if you lost your job because of it?
July 5th, 2007 at 12:11 PM
@29
If you read the blog, you will see that I reacted to someone else’s “name calling” and “slagging off”. This is a commercial operation which takes a substantial amount of money from subscribers, and as such, the subscribers have a right to expect competence from the service provider. It is not advertised as a beta testers’ paradise. Yes, it is leading edge, but if you read Joshua Linden’s explanation of why this happened, INADVERTENT basically means INCOMPETENT. Setting the wrong flags on a compile is something that should not happen. Just in case you all think I am a whiner who doesn’t take part in serious fault reporting, look at JIRA and you will see that I submitted as much information as I could to try and identify where the problem may have been. As a matter of interest, those who criticized me were noticeably absent on the JIRA. So, to those people…..go screw yourselves.
July 5th, 2007 at 12:19 PM
Wow, the grid relies that heavily on compiler optimization?
July 5th, 2007 at 12:22 PM
wow, another impressive display of trolling. anyways, happy to hear the problem has been fixed, and even more so the fix was such a simple one
really looking forward to the het-grid, when we finally can betatest under full load
July 5th, 2007 at 12:23 PM
Darien Caldwell Says:
July 5th, 2007 at 12:00 PM PDT
@16 Quote: “Shut up. Seriously. Defrag? Are you living inthe 90s? There is ABSOLUTELY NO REASON TO DEFRAG. None. Unless you are using the FAT16 or FAT32 filesystems. And if you are, you deserve what you get.”
You still have to defrag with NTFS. Maybe you should redirect all that bitterness into learning some facts :p
—–
Which might be relevant. If they were running Windows servers. Which they aren’t. You don’t need to defrag real journaled filesystems.
Speaking of facts….
July 5th, 2007 at 12:26 PM
@17 : common non-english mistake.
@16&30 : yes, ntfs seriously need defrag.
Anyway…. simulators run on linux, and a decent filesystem (ext3 ?). Any modern and well designed filesystem doesn’t need defrag.
@26 : thank you
July 5th, 2007 at 12:28 PM
Great! for the SECOND WEEK IN A ROW my event is ruined by rolling restarts! Thanks guys.
July 5th, 2007 at 12:33 PM
Not all Lindens are US based. So not all were on holiday.
Brits for instance do not celebrate 4th July.. Why would we
July 5th, 2007 at 12:38 PM
Garth - It means you can’t take “credit” for our current administration. I’d be celebrating if I were you…
July 5th, 2007 at 12:41 PM
—-ROLLING RESTART HAS SIGNIFICANTLY SLOWED SCRIPTS!!!—-
I was working in my home-sim and everything was going fine. The rolling restart hit my sim…and then I went back. ALL scripts then started taking about 2 minutes to fire on their first instance!
This procedure was duplicated multiple times for multiple scripts that have had no bugs/flaws/etc. before the rolling restart…
July 5th, 2007 at 12:42 PM
Just a side note: with this decrease in performance I have noticed that items received from my vendors ….. textures that were full permissions are now changed to no copy, and it also didn’t take the money and still gave the textures out. So everyone check their permissions on items. I had two customers this has happened with on my textures in the past few days: what were supposed to be full perm textures they received as no copy, and my transactions say zero. Also, my other vendors are malfunctioning as well.
July 5th, 2007 at 12:44 PM
hahaaa i knew it! thats why i didnt install the new upload!!
July 5th, 2007 at 12:47 PM
Kenn - Rolling restarts are going to tax all parts of the grid while in progress. Database servers are hit as each sim has to load all rezzed items, which also slows communication between servers because of the sheer amount of data to pull.
I would hold off on bug reports until the rolling restart is finished just to make sure it isn’t an issue of an incomplete grid. That said….I logged before the restarts hit so that I didn’t have to experience any such problems.
July 5th, 2007 at 12:48 PM
k all our comments should prove to sl something! u guys work amongst yourselves and still have problems, WHY CANT WE WORK TOGETHER!!!!!!!! TOGETHER WE CAN DO AMAZING STUFF AND MORE PEOPLE CAN BE HAPPY!
July 10th, 2007 at 6:23 PM
[...] combined with the rolling restart yesterday will hopefully fix some of the lag we’ve been [...]