Archive for the '-Miscellaneous' Category

We’ve just updated the Quality Metrics page, and the numbers show what you already know: April was not a good month for Second Life Grid availability. Our internal outage tracking tool estimates that about 630,000 usage hours were lost to global system failures over the course of the month, which is about 1.9% of the total (up from 0.06% in February and 0.22% in March), and resident surveys clearly indicate great unhappiness coinciding with these failures. (We define lost usage as how much time Residents would have spent logged in but did not, due to Grid failures; it is meant as a global availability metric and does not cover local failures like sim crashes, inventory problems, and the like. See actual[black] vs predicted[blue] concurrency graph excerpt, right.) I’d like to address the causes for this, and what we are doing about it in general terms.
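As a rough illustration of the lost-usage definition above, here is a minimal sketch that sums the shortfall between predicted and actual concurrency. It is not the internal outage tracking tool; the function names, the hourly sampling interval, and the sample numbers are all assumptions made for the example, as is the choice to express the lost share against total predicted usage.

```python
def lost_usage_hours(predicted, actual, interval_hours=1.0):
    """Sum the shortfall between predicted and actual concurrency.

    predicted, actual: concurrency samples (Residents logged in) taken at
    the same instants. interval_hours: spacing between samples (assumed).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    # Only count samples where actual fell below predicted; a surplus is
    # not treated as negative loss.
    return sum(max(p - a, 0) * interval_hours
               for p, a in zip(predicted, actual))


def lost_usage_fraction(predicted, actual, interval_hours=1.0):
    """Lost hours as a share of total predicted usage hours (assumed base)."""
    total_predicted = sum(predicted) * interval_hours
    return lost_usage_hours(predicted, actual, interval_hours) / total_predicted


# Hypothetical four-hour window with a two-hour outage in the middle.
predicted = [50000, 52000, 54000, 53000]
actual = [50000, 30000, 10000, 53000]

lost = lost_usage_hours(predicted, actual)
share = lost_usage_fraction(predicted, actual)
```

Read the same way, the figures in the post (about 630,000 lost hours at about 1.9%) would imply a monthly total of roughly 33 million usage hours.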
(more…)
SO! We have a small little experimental feature we want you to know about…
(more…)
So our gangs of skilled Moles are ready with the components to expand the Second Life road network. We’ll try to match the roads to the terrain, and to the existing road styles … but our eager little insectivores have some creative ideas, too.
For the first wave of expansion, we’ve identified half a dozen routes in the “Atoll” continent. Here are the proposed routes and the survey. Make your choice!
We’ll give the survey until at least Friday to run, and then we’ll put the LDPW to work. Wave when you see a mole in a distinctive orange vest, and remember to “give ‘em a brake” when you’re driving in a construction zone.
[UPDATE on 1 May 2008] We’ve had 653 responses to the poll as of 2:18 pm SLT today. 35.9% of the respondents chose Route 1 as their first choice, followed by Route 2 with 25.8%. Route 2 had 25.7% of the second choice votes, followed by Route 1 with 20.0%.
Routes 5 and 6 never got more than 7% each in the first and second choice polls … third choice voting was spread pretty evenly among all the routes.
Our moles will begin work immediately on Route 1. Thank you for your votes and the discussion!
One of the changes that went out in the 1.21 Server codebase enables us to alleviate database load caused by “spare” simulators - processes waiting to pick up regions after a restart. Unfortunately, a bug was found that prevents us from enabling the service. The bug did not hold up the 1.21 Server deploy significantly since it affected hosts in only one of our co-location facilities, and the new service was disabled within a few minutes of this being noticed for those hosts.
To send out a fix and reap the benefits of lower database load we need to do a follow-up rolling restart to 1.21.1 Server. (We’re as thrilled as you are.) There are no behavior changes. No new viewer is required. Each region will be given a 5 minute warning and then restarted.
Schedule:
- Monday 4/28, 5-6pm: Pilot roll to 1 or 3 racks
- Tuesday 4/29, 5-11am: Roll to half of the grid
- Wednesday 4/30, 5-11am: Roll to rest of the grid
[Resolved at 10:50pm Pacific] Our Support Portal is back online!
[Updated at 10:10pm Pacific] Maintenance is in progress. — Frontier
As reported earlier this week, our support portal will be offline for system maintenance tonight, Saturday, 26th April.
Our software supplier has reduced the length of the downtime from six hours to three; it will now run from 9:00pm to midnight PDT.
During that time, the support portal will be unavailable for chat or ticketing services.
Apologies for any inconvenience this may cause to you.
[RESOLVED 10:00 PM PDT] These issues have been resolved. Thanks for your patience.
[UPDATE 8:38 PM PDT] We are still trying to nail down the problem. Thank you for your continued patience.
[REOPENED 6:25 PM PDT] We are currently experiencing some asset server issues and are investigating the cause. Updates will be posted here when we have further information. Thank you for your patience.
[RESOLVED 5:00PM PDT] Everything should have returned to normal by now as the repairs have finished. We apologize for any inconveniences you may have encountered during this time.
[UPDATE 4:00PM PDT] We still have not gotten word on the status of the repair. We apologize that this is taking so long. We hope to receive word soon.
[UPDATE 3:25PM PDT] Our vendor has dispatched an engineer to the facility. We are waiting for them to complete the repairs so we can bring the system back up to full speed. You may still notice intermittent problems until those repairs are completed. Further updates to follow.
[UPDATE 2:40PM PDT] The problem has been traced to a faulty piece of equipment, and we are attempting to have it repaired or replaced. Once that has been done, things should return to normal. We will keep you updated on our progress.
[2:14PM PDT] We are investigating intermittent database and asset server issues. These may affect logins, teleporting, viewing scripts/notecards, and rezzing items in world. We are aware of the problem and working to isolate the source. We will have updates as we have more information. -Chiyo
[8:51 AM - RESOLVED] - The manual process has been completed and all group payouts have been run. - Matthew
[8:38 AM - Update] - Our engineering colleagues have identified the issue and are running the group payout process manually at the moment. It should be completed very soon and all groups will have paid out.
[8:17 AM - Update] - No further information on this at the moment.
We’ve been told by residents their nightly group distributions have not yet occurred. Upon further inspection we have found this may be true for other groups as well. We are investigating and will update when group distributions have completed or when we know when we can expect them to do so. -Chiyo
[Updated Saturday @ 09:10am] The rolling restart of the rest of the grid is now complete.
[Updated Friday @ 8:39am] The rolling restart to half of the grid is now complete but for 7 hosts that needed to be manually updated; those will be completed within a few minutes. The rest of the grid will be updated tomorrow morning.
[Updated Thursday @ 7:10pm] We have completed the deploy of 1.21 to 3 racks (632 regions). Here is a list of regions that as of now are on version 1.21.0.85745.
[Updated Thursday at 12:47pm] We have deployed 1.21 to 1 rack (about 170 regions) again. If all goes well, we will continue with the tentative timeline listed in the Wednesday at 8:10pm update below.
[Update Wednesday @ 9:15pm] A slight and subtle wrinkle during the deploy left some object-to-object emails non-functional. The responsible systems have gotten a stern talking to, and this service should be operational again.
[Update Wednesday @ 8:10pm] Another bug was found after we rolled out to one rack. That bug has been found and fixed. We will evaluate exactly what we’re going to do with this deploy after testing tomorrow, but it will likely shift the timeline forward by one day. Meanwhile, we are rolling back the 170 regions that had previously received a 1.21 deploy so that all simulators are once again running on version 1.20.1 of the server code.
The central updates to 1.21 are complete and things seem “nominal” at the moment, but of course we’ll be watching closely.
- Wednesday 4/23 @ 11am - deploy to 1 rack [DONE] [REVERTED]
- Wednesday 4/23 - update central systems throughout the day [COMPLETE]
- Thursday 4/24 @ 6pm - deploy to 3 racks [COMPLETE]
- Friday 4/25 @ 5am-11am - deploy to half of remaining servers
- Saturday 4/26 @ 5am-11am - deploy to remaining servers
[Update Wednesday @ 10:25am]
The bug in the 1.21 Server code identified last night during an initial rollout to 1 rack has been found, fixed, and verified. We’re planning to proceed with the rollout to avoid delaying the code update another week. On the table for today are the central services updates and limited rolling restarts.
What’s Changed in 1.21 Server
The most notable fixes will be physics-related, and have been in testing in the Beta Preview for several days. No new viewer is required.
Read on for more information…
(more…)
[CLOSED 11:07 a.m. --teeple]
All regions affected have been returned to service. Thanks for your patience!
[UPDATE 10:08 a.m. --teeple]
The vast majority of regions are back online. Operations is still working on a small number of them. We’ll let you know as soon as they’re done.
***
As a result of a recent Rolling Restart, many regions in Second Life are currently down. Our Server Team is working quickly to get them back online as soon as possible. More Updates here as information is received. Thank you for your continued patience.
[UPDATE: 2008-04-24 9:27am PDT] Thanks for the quick feedback on 1.20 RC3. We’ve found and fixed the cause of the crash on editing appearance (VWR-6792) and will be making 1.20 RC4 available later today. We’ve removed RC3 from the download page. Please use 1.20 RC2 or the 1.19.1 primary download until RC4 is available.
Hi again everyone.
We’re back to weekly Release Candidate releases. We increased the frequency of Release Candidate viewers last week in an attempt to gather more info/feedback on the Nvidia issues that have been reported (VWR-6343). Second Life 1.20 RC3 is now available and includes a batch of fixes to issues reported in the previous 1.20 release candidates (RC0, RC1 and RC2) and new code that we believe should address the Nvidia issues. Please visit the test software page to download the Second Life 1.20 (RC3) Release Candidate viewer.
Fixes: (more…)