Art is Art and Water is Water

February 14, 2019

Teaching an old ERROR new tricks

Filed under: Twitter Threads, Windows — foone @ 6:54 pm

(Original Twitter thread posted on November 3, 2018)

It is 2018 and this error message is a mistake from 1974. This limitation, which is still found in the very latest Windows 10, dates back to BEFORE STAR WARS. This bug is as old as Watergate.

When this was developed, nothing had UPC codes yet because they’d just been invented. Back when this mistake was made, there was only one Phone Company, because they hadn’t been broken up yet. Ted Bundy was still on the loose. Babe Ruth‘s home run record was about to fall. When this bug was developed, Wheel of Fortune hadn’t yet aired. No one had seen Rocky Horror. Steven Spielberg was still a little-known director of TV films and one box-office disappointment. SNL hadn’t aired yet. The Edmund Fitzgerald was still hauling iron ore.

WHEN THIS STUPID MISFEATURE WAS INVENTED, THE GODFATHER PART II HAD JUST OPENED IN THEATERS.

So, why does this happen?

So Unix (which was only 5 years old at this point) had the good idea of “everything is a file” which meant you could do things like write to sockets, pipes, the console, etc with the same commands and instructions. This idea was brought into CP/M by Gary Kildall in 1974. You could do neat things with it like copy data off the serial port into a text file, or print a text file right from the command line!

This is done in unix by having special files existing in special folders, like /dev/tty for the console or /dev/lp0 for the first printer. You can get infinite zeros from /dev/zero, random bytes from /dev/random, etc!

But here’s the problem: CP/M is designed for 8-bit computers with very little memory, and no hard drives. At best you’ve got an 8″ floppy drive. So directories? You don’t need ’em. Instead of directories, you just use different disks. But without directories you can’t put all your special files over in a /dev/ directory. So they’re just “everywhere,” effectively. So if you have FOO.TXT and need to print it, you can do “PIP LST:=FOO.TXT” which copies foo.txt to the “file” LST, which is the printer.

And it works wherever you are, because there are no directories! It's simple. But what about extensions? Here's the problem: programs like to name their files with the right extension.

So if you’re running a program and it goes “ENTER FILENAME TO SAVE LISTING TO” you could tell it LST to print it or PTP to punch it out to tape (cause it’s 1974, remember?) but the program might try to put .TXT on the end of your filename! LST.TXT isn’t the printer, right? Nah. It is. These special devices exist at all extensions, so that this works. So if “CON” is reserved to refer to the keyboard, so is CON.TXT and CON.WAT and CON.BUG

Eh. It’s a hack, but it works, and this is just on some little microcomputers with 4k of ram, who cares?

Well CP/M caught on widely through the late 70s and early 80s. It was one of the main operating systems for business use. It defined an interface which meant you could write CP/M code on a NorthStar Horizon and run it on a Seequa Chameleon.

The lack of a portable graphics standard kept it out of the games market for the most part (though there are Infocom releases) so it was mainly business users. But it was big, so naturally IBM wanted it for some “PC” project they were doing in early 1980.

So IBM intended to launch the IBM PC with several operating systems, and was expecting CP/M to be the "main" one. But CP/M for the x86 didn't come out until 6 months after the IBM PC launched… and it cost $240 vs $40 for DOS. So the vast majority of users ended up using Microsoft's PC-DOS, which was an evolution of a new OS developed by Seattle Computer Products. MS purchased Tim Paterson's project and developed it into PC-DOS (which later became MS-DOS, if you're not aware).

Tim Paterson's OS was called "QDOS", for "Quick and Dirty Operating System". It was written basically because CP/M didn't have an x86 version yet, and as an attempt to solve some of CP/M's limitations. It was definitely inspired by CP/M, in a lot of ways.

One of those main ways was keeping the idea of special files and no directories, because that was a useful feature of CP/M. So QDOS and PC-DOS 1.0 have AUX, PRN, CON, LPT, etc, too!

For PC-DOS 2.0, released in 1983 for the new IBM XT, Microsoft significantly revamped PC-DOS. The IBM XT featured a hard drive, so PC-DOS needed directory support. You need them to keep your massive 10mb hard drive organized, obviously! But here's the problem: users had been using these special files since the PC-DOS 1.0 release two years earlier. Software had been written that used them! Batch files had been written that depended on them. With directories, Microsoft could now make a C:\DEV folder …

But they didn’t.

For what wouldn't be the last time, Microsoft sacrificed sanity for backwards compatibility: Special files are in EVERY DIRECTORY with EVERY EXTENSION. So your "DIR > PRN" trick to print the directory listing doesn't break just because you're in C:\DOS instead of A:\

But we’re not running DOS 2.0, of course … And when Windows 95 was released, it was built on top of DOS. So it naturally inherited this behavior. (Windows 1/2/3 similarly did, but Win95 was much more an OS than they were).

But hey, we're not running Windows 95 anymore! The current branch of Windows is based on Windows NT, not Win95. But Windows NT wanted compatibility with DOS/Windows programs, and XP merged the two lines. So these special files still work, FORTY-FOUR FUCKING YEARS LATER.

Feel free to try it yourself! Open Explorer, do "new text file," and name it con.txt, aux.txt, or prn.txt. It'll tell you: NOPE!

So because of Gary Kildall going "Special files representing hardware devices! That's a neat idea, Unix. I'll borrow that idea and try to hack it into my toy-computer OS" so long ago that people born that year can have children old enough to drink … we can't name a file con.txt.

Microsoft gives the official list here: CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9
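If you ever need to guard against these names programmatically (say, in a tool that writes user-supplied filenames on Windows), a rough check is easy to sketch in Python. This is an approximation of the rule described above, not Windows' actual parsing logic:

    # Rough sketch: flag names Windows treats as DOS device names.
    # The reserved stem matters regardless of extension or case, per the
    # rule above; this is an approximation, not Windows' exact parser.
    RESERVED = {"CON", "PRN", "AUX", "NUL",
                *[f"COM{i}" for i in range(1, 10)],
                *[f"LPT{i}" for i in range(1, 10)]}

    def is_reserved(filename):
        stem = filename.split(".", 1)[0]  # "con.txt" -> "con"
        return stem.upper() in RESERVED

    print(is_reserved("con.txt"))      # True  -- Windows refuses this name
    print(is_reserved("console.txt"))  # False -- only the exact stem is reserved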

For extra fun, accessing C:\con\con (or C:\aux\aux) on win95 would cause it to bluescreen instantly. That was hilarious back in 1995, because it was a 21-year-old bug! Imagine some mis-design hanging on that long?

Bonus: Here's a picture of Tim Paterson at the Vintage Computer Festival West this August, giving a talk on the history of DOS.

And if you want the backstory for how I got into this mess where I have a file I can't copy: these special-device names are implemented at the OS level, rather than the file system level. So they're perfectly valid NTFS filenames, and I was using an NTFS drive in Linux. And apparently OS/2 didn't implement these special names either, cause IBM shipped some OpenGL headers as AUX.H on one of the OS/2 devcon disks.

So today I was trying to backup this NTFS drive onto my main PC and WHOOPS CAN’T COPY ALL FILES CAUSE OF BUGS OLDER THAN MOST PEOPLE READING THIS
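(If you're ever stuck with such a file on an NTFS volume under Windows, the usual escape hatch is the "\\?\" path prefix, which tells Win32 to skip its legacy device-name parsing. Something like the sketch below should let you rename the offender; the paths here are made up for illustration.)

    # Sketch: the \\?\ prefix bypasses Win32's DOS device-name parsing,
    # so reserved-stem files like AUX.H become reachable again.
    # The paths below are hypothetical -- point them at the real file.
    import os

    bad = r"\\?\C:\backup\os2headers\AUX.H"
    os.rename(bad, r"\\?\C:\backup\os2headers\AUX_renamed.h")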


Normalization of Deviance

Filed under: Thoughts, Twitter Threads — foone @ 6:05 pm

If I were setting up the curriculum at a university, I'd make an entire semester-long class on the Challenger disaster, and make it required for any remotely STEM-oriented major.

Because I think it's too easy to think of it as just a random-chance disaster, or just a space/materials-engineering problem that only has lessons relevant to that field. And that's not really the most important lesson to learn from the Challenger disaster!

Because, yeah, you can look at it as just a random-chance disaster, like a natural event. You could look at it as a lesson on problems in rocket design, on the problems with the shuttle program, on the risks we have to take to explore. But the most important lesson, in my opinion, is the one about "Normalization of Deviance." The Challenger disaster wasn't a single mistake or flaw or random chance that resulted in the death of 7 people and the loss of a 2 billion dollar spaceship. It was a whole series of mistakes and flaws and coincidences over a long time, and at each step they figured they could get away with it, because the risks seemed minimal and they had plenty of engineering overhead. And they were right, most of the time…

Then one day they weren’t.

Normalization of deviance is the idea that things are designed and limits are calculated. We can go this fast, this hard, this hot, this cold, this heavy. But we always want to optimize. We want to do things cheaper, quicker, more at once.

And the thing is, most of the time going a little faster, a little hotter, that’s fine. Nothing goes wrong. Engineers always design in a safety margin, as we’ve learned the hard way that if you don’t, shit goes wrong very fast. So going 110% as fast as the spec says? Probably OK. But the problem is what if you’ve been doing that for a while? You’ve been going 110% all the time. It’s worked out just fine. You’re doing great, no problems. You start to think of 110% as the new normal, and you think of it as just 100%.

You probably don’t rewrite the specs to say the limit is 110%, but you always have the official rules and the “way things are done.” And everyone knows those don’t always exactly align…

Like your job's security policy says to never reuse passwords and never write them down, and they have to be 20 characters with 4 digits and upper and lower case and 3 Sanskrit characters. The computer tests all of those except "never write it down." Guess which one gets violated? And everyone does it, because the alternative is not getting work done while they wait on IT to reset their password. And this just becomes the unwritten How Things Are Done, despite the written How Things Are Done saying explicitly not to do this. And you do this in your office, and you think the stakes are low. And they probably are. But this kind of thing doesn't just happen to some punks in an office doing spreadsheets. It happens to actual rocket scientists.

So when the spec says 100% and you've been doing 110% for the last 20 missions and it seems to be working just fine, and then one day you're running into 5 other problems and need to push something, well, maybe you do 120% today? After all, it's basically just 10% over normal. Because in your head you're thinking of the 110% as the standard, the limit. You've normalized going outside the stated rules, and nothing went wrong. So why not go a little more? After all, 110% was just fine…

But the problem is that there’s no feedback loop on this. There’s often no obvious evidence that going outside the “rules” is wrong. Steve wrote down his password and it’s not like he got fired for doing that. So why not do it too?

And the feedback you do eventually, finally get might just be completely disastrous, often literally. You don’t get any “HEY STOP WRITING DOWN YOUR PASSWORDS” feedback until the whole company gets hacked and your division is laid off.

And the feedback you do get is misapplied. Like, I like to joke that my roommate’s cat is very smart. We want to keep her off the kitchen counter for sanitary reasons, so whenever we see her on the counter, we spray her with water. So she learned: never go up on the counter … when there’s someone there to see you. Susan gets in trouble cause she put a post-it note with her password on her monitor, and we had to sit through a boring security meeting about password security. So people learn. They put their passwords in their wallet and in their phone.

This is a silly example, but it’s also exactly what happened with Challenger. The O-rings on the solid rocket boosters had a problem where hot gases would leak past them during lift-off, but every time this happened, the O-ring would shift and reseal the leak. So it was a thing that was never designed to happen, but when it happened and seemed to be fine, they wrote it into the documentation. It was now just a thing that happened. Gas will escape past the O-rings, but it’s okay, they self-seal. And as long as everything was within original operating parameters, this’d be fine. But other things were pushed.

The Challenger launch was repeatedly scrubbed because of minor issues in other components, or cross-winds that were too high. And then NASA finally thought they had a day they could launch, but with one problem: it was too cold.

And it seems a silly thing to worry about it being too cold to launch a SPACE ROCKET, but when you design things you have to decide what temperature range they need to operate in. You gotta pick materials and do tests to fit that range. If your rocket is only going to take off in temperatures from 40 degrees F to 90 degrees F, you pick certain materials and test in those temperatures. If you had to launch at colder or hotter temperatures, you might need different materials and more expensive tests. So you decide on limits.

But you’ve launched at 40F and it was fine, and then one day you had to launch at 35F and it was fine, and then on a particularly bad day you had to launch at 30F and you’re fine. So you normalize this deviance. You can launch down to 30F, if you really have to. But then one day you’ve missed a bunch of launch windows and it’s 28F and the overnight temperatures were 18F but you did a quick check of the designs and specs and you probably have enough safety margin to launch, so you say GO.

And you discover 73 seconds into the flight that the O-rings that seemed to always self-seal? They don't self-seal if they're too hard and brittle from the cold. The gases keep leaking. The hole widens. High-pressure, high-temperature gas comes out of the booster rocket and starts to melt the attachment joints between the boosters and the external tank. It happens at the time when the rocket is undergoing the strongest stresses from take-off, and the tank fails. The solid rocket boosters separate from the now-disintegrating orbiter stack and have to be destroyed by a range safety officer. The crew probably survived in the reinforced cabin until it struck the ocean.

And it’s important that the lesson we learn from this isn’t as narrowly focused as “the space shuttle was badly designed” (it wasn’t! It was a compromised design that had lots of amazing work poured into it) or even “don’t launch spacecraft outside their design specs.” Because the thing about Normalization of Deviance as a concept is that it applies to all sorts of engineering issues, and not just mechanical engineering!

Like, think about a road: You know it’s going to be a 50 MPH road, so you design it as such. You don’t put sharp turns in a road where people are going 50MPH, because you know if people try to take them at 70 MPH they’ll crash. And people always push the limits. So you build your “50 MPH” road knowing people might be going 70 MPH. You design your turns & signage for that range. And the road opens and it works perfectly at 50MPH.

But some people go 70MPH, which is fine, you planned for that. The police stop a few of them. But as people go on the road and get used to it, they start going 60 MPH, just cause they can and nothing bad seems to happen. The normal becomes 60 MPH. So now the averages have shifted. You designed for 50 (with a +20MPH safety range) and now most people are doing 60 MPH, and the ones going a little fast do 70 MPH, and the ones going Extra Fast do 80 MPH. And maybe that seems fine. The people going fast know the risks they’re taking so they pay extra attention (for police cars, if nothing else). And it’s fine, for a while.

Then it rains, and what was safe at 50 MPH, borderline at 70MPH, and risky at 80 MPH is now borderline at 50MPH and risky at 60MPH and deadly at 80MPH. And a bunch of people crash. And they crash because they normalized the “rules-in-practice,” of “go 60, go 70 if in a hurry, go 80 if an emergency.”

My point with this is not to say “HEY PEOPLE STOP BENDING THE RULES,” exactly. It’s that you have to consider normalization of deviance when designing systems: How will these rules interact with how people naturally bend the rules?

Maybe you need to make these things explicit in your designs. Like “We can launch down to 39F based on our tests, but if we push that down to 30F we’ll need to do more research to make sure it’s safe in the long run.”

The really sad, scary thing is this kind of normalization of deviance problem didn’t just cost the space shuttle program one orbiter. It cost it TWO. Because 17 years after the Space Shuttle Challenger disintegrated on liftoff, the Space Shuttle Columbia broke up on re-entry.

It wasn't the solid rocket boosters and their O-rings this time, it was the insulation on the external fuel tank. The tank had to be covered in insulation to keep ice from forming on it, since ice breaking off during launch could cause damage. But on take-off, the foam itself often fell off. It was relatively lightweight and didn't usually cause any problems when it struck other parts of the orbiter. It'd even happened before the Challenger disaster, back in 1983. It was just "foam shedding," as they called it. A now-normal part of launch, even though no one had ever planned for it to happen. And this didn't cause a problem, the first 112 times they launched.

But on the 113th time, a chunk of foam the size of a suitcase hit the wing in a spot where they couldn't afford to be hit. And it turns out that even relatively lightweight foam can make a big hole when it hits the wing while the orbiter is moving at Mach 2.46. And they made it into space just fine, completed their mission in space just fine, but when they tried to re-enter, wing-edge temperatures of 2500F caused a failure of the structural components, as hot gas entered through the hole punched by the foam block. The foam shedding thing was always a problem. It'd always been a danger to the orbiter, and had been there from the beginning. But they'd gotten lucky 112 times in a row. So they didn't consider it a priority.

If they had realized exactly how this could cause a complete mission failure, they might have prioritized finding a way to fix the foam shedding. But it'd never been a problem before, so there were always higher-priority issues.

And that’s an element everyone building anything should consider: Your system not breaking doesn’t mean it works and is a solid design. It might just mean you’ve gotten lucky, a lot, in a row.

A real-world way I’ve run into this: at a previous job we had some tests that ran on machines, and they had a step where they’d install some special tools to run the test with, then use them. It turned out the “did we install the tools right?” part was always skipped. But no one noticed, because usually the tests would install fine, and if they didn’t, we’d immediately fail as the test tools weren’t there (or were the wrong ones). So we’d get the expected success or the expected failure. Seems to work fine, right?

And then one day someone makes a change in some unrelated code to try to limit how much we re-initialize test machines. We’ll leave some files in place so we don’t have to reinstall them all the time. And that affected these tools, too. And everything seemed to continue to work. We’d happily install the tools over the old ones, and it’s fine. But then someone accidentally broke the tools with a bad commit… and we didn’t notice for weeks.

Why? Well, the tools were broken now, and wouldn't compile and install. Which'd be fine and would have triggered failures, except we had that long-standing bug (that we didn't know about) where a failed install would still continue to run the test. And this had always triggered a test failure (due to missing tools) in the past, so we never had any issues with it. But now that we were keeping old files around, it meant the test would still run, as it'd use the files still on the box from the last time it'd worked.

So we thought we were running the new tool code, we thought the tools were working, we thought the tests were fine. We were wrong. Nothing was working. But we were lucky, so it looked like it did.

In the end we only discovered this was happening because we tried to set up some new machines. They naturally didn't have any tools from the last run (because they'd never run before), so they were failing tests that "worked" elsewhere.

Another reason this had happened was the size of the log files we generated and all the scary-sounding stuff that happened in them. We had built a system that generated thousands of lines of logs for every test, with lots of "failures" recorded in them. Things like "tried to initialize FOOBAR_CONTROLLER: FAILED!!!," but we just ran that code on all machines, even ones without the FOOBAR_CONTROLLER hardware. So no one noticed when another 5 lines of errors popped up in a 2000-line log file.

Because here's the thing: most of the time when there's a Serious Problem™, it's not just one event. Disasters aren't caused by one small event: they're an avalanche of problems that we survived individually, right up until they all happen at once.

Like, the Titanic disaster didn’t kill 1,500 people because they had a one-in-a-million chance of hitting an iceberg. Yeah, the iceberg was the linchpin in that disaster, but it’s just the final piece in that jigsaw.

If they hadn't been going so fast, if the radio operator hadn't been preoccupied, if the lookout's binoculars hadn't been missing, if it hadn't been a moonless night, if they'd not had rivet problems, if the bulkheads had gone all the way up, if they'd had enough lifeboats … It might have been a minor enough incident that you wouldn't have even heard of it.

Like, in 1907 the SS Kronprinz Wilhelm rammed an iceberg. It was a passenger liner (later a troop transport), and fully loaded it would have had over a thousand passengers and crew aboard. It survived. It completed its voyage and stayed in service for another 16 years.

You probably haven't heard of this incident. It's a single-line mention on a Wikipedia page. Because they didn't hit all the failures at once. They rolled the same dice and didn't come up all 1s.

Maybe they were going slower, maybe they had more lookouts, maybe they had better steel rivets, maybe they just happened to hit an iceberg on a full moon so they had more time to notice they were going to crash and could slow down more. I don’t know.

And the thing is, these sorts of disasters aren't just mechanical or natural. This happens to people, too. I was talking to a friend the other day about their situation and we talked about this exact thing: it's never just one thing.

It’s not like you get yelled at online or a friend is having difficulty and you go from “doing fine” to “nearly suicidal” in one step. No, it happens when all these things accumulate and coincide.

Your friend is going through a hard time and you’re trying to help, and normally that’s fine, but it happens on the day when you’re getting over a cold and your roommate is yelling at the cat and you get an unexpected bill and your fiancee is out of town. Each of these things on their own (or maybe with one or two others) is not a huge problem. You don’t have a breakdown. You don’t have a panic attack. But sometimes the dice come up the wrong way and all of them happen at once.

And I think the moral of the story is that you shouldn't feel bad about getting pushed over the edge by a "little thing," nor should you get mad at people for not being able to handle "a little thing." Because it's usually not that someone wakes up on a perfectly fine day, healthy and happy, steps outside their door, gets hit by a car, and the day goes from GREAT to SHIT in one step. It's usually lots of little things that accumulate. And you don't notice each of them piling on until you reach that limit. You especially don't notice it when it's someone else hitting that limit!

So give people slack. Be understanding when you ask them to do one thing and they can’t get to it or it causes them stress that you don’t understand. They have other shit on their plate that you can’t see.

And that goes especially for YOU. Give yourself the benefit of the doubt on these things. Too often I see people being mean to themselves in a way they’d never treat anyone else. Be nice to you. You gotta live with you.

When you’re feeling mad at yourself or down on yourself, think about how you’d treat a friend in that situation. You probably wouldn’t go “you idiot, you can’t do anything right, why are you such a mess?” But it’s not uncommon for people to think that about themselves.

In any case, the only real helpful suggestion I can give for these kinds of "overload" problems: it's fine to not address the one that "caused" the issue. It may be the one that pushed you over the edge, but that doesn't mean it's the easiest or most important to fix. If you can't do X because A+B+C+D+E being on your plate has overloaded you, it doesn't mean you have to directly attack X to fix it, or even the most recent problem (E). You can look at all the problems and find which can most easily be fixed.

Think of it like a video game inventory system. You found a gem and a rusty sword and a health potion, but now you found a key and you don’t have room in your backpack. You definitely need the key, but that doesn’t mean you have to break down and fail the mission. And it’s not the key’s fault that you got overloaded. It’s not even the potion’s fault, being the latest thing. You can look at all your problems and find the one to fix. Maybe that’s the rusty sword, freeing up a bunch of space in one move. Or maybe it’s just the gem: something small and lightweight, but it’ll free up just enough room for a key.

So maybe the answer is “ask your roommate to put the cat in the time-out room for now so they’ll stop scratching them,” so you can handle making that phone call to the vet. Maybe you need to go to the pharmacy and get some cold meds.

My point is just that you can become overloaded like a video game character over their weight limit. When you have Too Much and it’s a problem, you don’t have to Just Bear It and inch your way back to the store to sell all your dungeon loot. When you’re overloaded, reshuffle what’s overloading you and find which ones you can relieve, even if it doesn’t seem to be a direct solution to your problem. Because you’ll have a lot better success in getting things done once you have some capacity to deal with things.

This sort of thing is sometimes called “spoons theory” when it’s related to disabilities. The basic idea is that you have some number of “spoons” you use up during the day on each thing you have to do. Disability means you have to use some of the spoons on the disability.

So you might have 5 spoons, and you spend one on work, one on school, one on shopping, and have 2 free for anything else you have to do that day. But with a disability you might be spending one every day just on the disability. And on a bad day you have to spend two or three on it, and now the normal stuff you usually have time and energy for just isn't possible, because you've run out of spoons.

And it’s too easy to read “disability” and think “missing a leg” or “chronic illness like lupus.” Disabilities can be in your head just as easily, because that’s where YOU are. Depression is one. Anxiety, PTSD, ADHD, OCD … there’s plenty of illnesses that can use up spoons.

And maybe you think you’re doing fine, and your friends and coworkers think you’re fine, because you’ve got that spoon to afford on your disability. You compensate, and it works out. And then it’s a day where everything else is going wrong for random chance reasons and now you can’t afford that spoon and it seems like you’re failing and having a breakdown. It doesn’t mean your illness wasn’t there until then and just suddenly affected you! It just means you reached the point where you couldn’t afford your coping mechanisms, because you were overloaded.

It reminds me of how people with retina damage from lasers can have a lot of it without it seeming to affect them very much. Sometimes they don't even notice it, because the eye already has a big blind spot, and the visual system works hard to make it seem like it's not there. It fills in the blank bit. You get too close to a laser without proper eye protection, and now you have retina damage, but what's one more hole to cover up? So your vision fills in the gap, and then you get more damage, and more, and it keeps filling in, but the total amount you can see is slowly going down, and your vision is worsening. Eventually your vision can't compensate.

And it’s the same thing with mental illnesses: You cope. You spend spoons on making up for the problems they cause. You may stay functional … but you are spending spoons. You don’t have an unlimited budget.

So think about your workload (and by "work" I don't just mean the 9-5 money-making sort of work). You have limits. And it's not a bad thing when you have to cut back, when you have to relax, when you have to take time to heal. Because it seems to be in our nature to normalize whatever we're successfully doing, keep pushing ourselves, and not realize how close we are to being overloaded.

There's nothing wrong with trying to avoid that point, and there's especially nothing wrong with having to cut back on what you can do once you do hit that point. If you try to load 9 boxes in your car and only 7 will fit, you don't get mad at the car for not "toughing it out." You're a machine with limits too. Those limits are different because you're conscious and biological rather than mechanical and electronic, but you've still got limits. Keep that in mind.

 

June 20, 2017

Uncovering the Sounds of SkiFree

Filed under: Game Design — foone @ 9:27 pm

[image: ski]

So I was looking into the disassembly of SkiFree (like you do), the classic skiing (pronounced she-ing) game for Windows 3.1 as part of the Microsoft Entertainment Pack #3. I noticed an odd string constant: “nosound”. The game checks for a command line parameter to turn off the sound, which is a very strange thing for a game with no sound effects or music to do!

I followed the disassembly to see where this option was being referenced, and to my surprise I found code to load sound effects, as well as trigger them at different points in the game. The only reason why the game appears to have no sound is that no WAV files were shipped in the EXE. So the obvious next step is, what happens if we edit some WAV files in? Will it play them as sound effects at the appropriate time?

Yes, in fact!

So I figured out which sounds meant what by using a set of sounds that were just a robot voice saying "SOUND ONE" through "SOUND NINE". I asked a musician friend to see if they could generate some appropriate sound effects. While I waited for them to get back to me, I checked the official site for information.

It turns out that back in 1993 the original creator of SkiFree (Chris Pirih) was working on a version 2.0 of the game, to add better physics, network play, sound effects, robot opponents, the works. Sadly they ran into difficulties with the physics implementation and then misplaced the source code, so that game was never completed.

However, in 2005 they were able to locate a 1.03 version of the source, which is an intermediate step between the original 1.0 release and the planned 2.0 release. With this source they were able to build a 32bit version of the game, to allow it to run on modern systems (64bit Windows is backwards compatible with 32bit programs, but not with 16bit programs like the original SkiFree). This version apparently included some of the improvements planned for 2.0, but no one had noticed the sound system for 12 years because it was effectively disabled by the lack of embedded sound effects. I was able to confirm this by partially disassembling the original 16bit 1.0 version, and found no trace of a sound system there.

I emailed Chris Pirih about my discovery, and to ask if they’d ever created sound effects to be used with SkiFree 2.0, and if they still had them. To my surprise, they emailed me back within a few hours, providing a full set of sound files for SkiFree!

[image: sfx]

They also included a header file which mapped the sound effect names to the resource numbers, to allow me to insert them in the correct order. So I updated my earlier test with the new sounds, and created a new demo video of it in action.

The new sounds don’t play entirely properly, because SkiFree is using a very limited API to play back sounds. It only supports playing one sound effect at a time, so often sound effects will be cut off because another sound effect plays after it. The “argh” sound of the yeti is particularly cropped, because it’s a long sound that is cut off every time the yeti takes a step.

I've briefly looked into fixing this, but it'll require switching to a different sound API, as SkiFree uses sndPlaySound, a very simple high-level API which can't be used for any sort of sound effects that require mixing. That's beyond the scope of my quick hackery for today; it'll probably require adding an external wrapper DLL to manage the sound system. Hopefully that can be done without too much effort; I'll have to see.
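You can get a feel for the limitation from Python, whose standard-library winsound module wraps the same family of simple Win32 playback calls. This is just an illustration with made-up filenames, not SkiFree's actual code (which calls sndPlaySound from C):

    # Illustration: with these simple playback APIs, starting a second
    # asynchronous sound immediately cuts off the first one.
    import time
    import winsound

    winsound.PlaySound("argh.wav", winsound.SND_FILENAME | winsound.SND_ASYNC)
    time.sleep(0.1)
    # The yeti takes a step: this call stops "argh.wav" mid-playback.
    winsound.PlaySound("step.wav", winsound.SND_FILENAME | winsound.SND_ASYNC)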

As for these sounds, I suspect some of them are placeholders that would have been replaced before SkiFree 2.0 was complete. It’s still neat to see them in action, even if they may not fit in perfectly.

If you'd like to try out this version of SkiFree with the sounds edited back in, you can download ski32sounds.zip. I've made no code changes in this version, just added the sound resources. This should work fine on both 32bit and 64bit Windows systems.

One final thing to consider about sound in SkiFree: Even if the original version had been intended to have sound, it wouldn’t have been possible for two reasons:

  1. The first version of Windows to include a sound API was the Windows 3.0 with Multimedia Extensions OEM-only release, which came out a month after SkiFree 1.0.
  2. There was no room on the disk. SkiFree 1.0 was released in an entertainment pack that included 7 games. Adding sound effects could have easily quadrupled the size of the game, requiring an extra disk or kicking another game out of the collection.

 

August 1, 2016

MURIDOS Devlog 4: Backgrounds 2

Filed under: MURIDOS — foone @ 11:11 pm

The next step: Investigate the palettes being generated and see if there’s something going wrong there.

I completely rewrote the palette generation (not hard when it's just a big unix command line), reexamined all the images, and compared against the existing palettes: perfect match. I double-checked the palette loading code to make sure there's no issue there, like swapped channels or something similar. Nope, everything looks kosher.

So let's pretend we don't have a palette issue and move on to the next step: DUM loading, DUM being the custom image format I created for this project. It's very simple! 5 bytes of header, giving it a unique identifier (D8 for "Dum 8-bit") and width/height, followed by zlib-compressed image data.
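For a sense of scale, a codec along those lines fits in a few lines of Python. The exact split of the five header bytes below (one identifier byte, then 16-bit width and height) is my guess, so treat this as a sketch of the idea rather than the real on-disk format:

    # Sketch of a DUM-style codec: tiny header + zlib-compressed 8-bit pixels.
    # The header layout here (identifier byte, 16-bit width, 16-bit height)
    # is an assumption; the real DUM format may split its 5 bytes differently.
    import struct
    import zlib

    def encode_dum(pixels, width, height):
        return b"\xd8" + struct.pack("<HH", width, height) + zlib.compress(pixels)

    def decode_dum(data):
        width, height = struct.unpack("<HH", data[1:5])
        return width, height, zlib.decompress(data[5:])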

This is a straightforward if tedious process. Using the 5 lines of Python that comprise the encoder as a reference, I write 70 lines of simple C++ to load DUM files. Then I run the prototype to test it, and hey, look!

[image: layers2]

Suddenly our colors look a lot better than last time. My hunch was accurate: something about the TGA conversion process or TGA loading process was corrupting the colors. DUM has no conversion, and therefore has no problem.

The only remaining issue with backgrounds is that the loader needs to be expanded to be more efficient. Currently it loads the entire image into RAM, creates an equally large allegro image, and does all the decompression in one shot. This requires more RAM than is necessary, and won't work on real DOS hardware, where we may not be able to access all the video RAM at once, thanks to bank switching. For this early test it's fine, but it may become an issue in the future once we start running on real DOS hardware.

For next time, we’ll look into loading objects into our empty levels.

July 28, 2016

MURIDOS Devlog 3: Backgrounds

Filed under: MURIDOS — foone @ 9:21 pm

We’ve got a blank window with correct palette, so what’s the next step? Backgrounds!

It's not as simple as a plain background, like a single image. Or even a single layer of tiles! No, this is a multilayer system. The first step is converting the output of the room-dumper script into something that's easy to load with C++ code.

 -50: x=90, y=38:  0, 0:bg_tiles1
 -50: x=90, y=39:  1,16:bg_tiles1
8999: x=50, y=32:  2,20:bg_tiles1
8999: x=50, y=33:  6,20:bg_tiles1

This means “layer at priority -50,  coordinates 90,38, draw tile at 0,0 from the bg_tiles1 set”. But thankfully the background tiles are all the same size (16×16 pixels), so we don’t have to encode that.
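Parsing those dump lines is straightforward; here's a rough sketch in Python (the field names are mine, and it assumes the exact spacing shown above isn't significant):

    # Sketch: parse one room-dump line like "  -50: x=90, y=38:  0, 0:bg_tiles1"
    # into (priority, x, y, tile_x, tile_y, tileset). Field names are mine.
    import re

    LINE = re.compile(r"\s*(-?\d+):\s*x=(\d+),\s*y=(\d+):\s*(\d+),\s*(\d+):(\w+)")

    def parse_tile_line(line):
        m = LINE.match(line)
        priority, x, y, tx, ty = (int(g) for g in m.groups()[:5])
        return priority, x, y, tx, ty, m.group(6)

    print(parse_tile_line("  -50: x=90, y=38:  0, 0:bg_tiles1"))
    # -> (-50, 90, 38, 0, 0, 'bg_tiles1')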

The scheme I ended up using, for tiles:

[image: rect4290]

A 16bit integer, with the first 4 bits indicating which of the 5 tilesets is used. The next 4 bits indicate the x-coordinate (the tilesheets are 8-12 tiles across), and the last 8 are used for the y-coordinate. Naturally these are tile coordinates (so they're multiplied by 16) rather than pixel coordinates.
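In code, packing and unpacking that looks something like this (whether the tileset index sits in the low or high bits of the word is my assumption; the real loader may order the fields differently):

    # Sketch of the 16-bit tile encoding: 4 bits tileset, 4 bits tile x,
    # 8 bits tile y. Putting the tileset in the low bits is an assumption.
    def pack_tile(tileset, tile_x, tile_y):
        assert 0 <= tileset < 16 and 0 <= tile_x < 16 and 0 <= tile_y < 256
        return tileset | (tile_x << 4) | (tile_y << 8)

    def unpack_tile(value):
        tileset = value & 0xF
        pixel_x = ((value >> 4) & 0xF) * 16   # tile coords -> pixel coords
        pixel_y = ((value >> 8) & 0xFF) * 16
        return tileset, pixel_x, pixel_y

    print(unpack_tile(pack_tile(2, 6, 20)))   # -> (2, 96, 320)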

Then those raw blocks are grouped into rectangles, which have a width/height and an x&y offset. This will be more efficient than having all the layers be the same size, as most layers will only encompass a small subset of the map, and we don't want to waste time processing empty tiles. We just need to sort the layers by priority so that they render in the correct order.

So here’s a display of level 1-1, with the layers outlined:

[image: layers]

It's as simple as that! We go through the layers, blitting the selected tiles to the screen (eventually a backbuffer).

The only problem so far is that those colors are completely and totally wrong. And some layers seem to be missing. More investigation is needed!

July 27, 2016

MURIDOS devlog 2: Initial code

Filed under: MURIDOS — foone @ 10:30 pm

So now that we've got all the resources helpfully extracted, it's a matter of putting them to use. We have all the code from the original game, but that's not yet helpful:

  1. It’s not a language we can directly use (it’s Game Maker 7 code) or emulate
  2. It's only half the code of the game. It handles interaction between objects and when to spawn them, but all the code that draws them or detects collisions, draws backgrounds and sprites, plays sound and handles input? That's all built into Game Maker.

So let's put the code aside for now. We need to build something from scratch, and we need to use it in DOS. This limits our options. Running SDL/pygame/OpenGL? Out of the question!

So I’m going to be using Allegro. Specifically, Allegro 4.2 as that was the last version that supported DOS. This gives us much of the functionality we’d get out of something like SDL, and we can compile for DOS using DJGPP. We can also compile for Linux/Windows natively, which means we can easily develop without having to keep a DOS VM on hand.

Now using Allegro with DJGPP means we'll be targeting a 32bit extender (so a 386+) and VGA graphics. This may seem like overkill for the type of game MURI is in tribute to, and it is. MURI is definitely inspired by 16bit EGA games like Captain Comic, Commander Keen, and Duke Nukem. But the reality of how this game will be played is that it's most likely going to be in DOSBox or on (relatively) modern DOS gaming machines. So neither the 386+ nor the VGA requirement will really hold us back in terms of what computers can play it. It's a minor inaccuracy, but one we'll have to live with.

So to begin with, we start with a simple Allegro example. We set up graphics and keyboard, display a simple screen, and exit. The first thing added is loading the palette:

[image: palette]

It has several similar but not exactly identical colors. This is another reason we're not targeting real EGA: EGA is limited to 16 colors and we have 26. We could merge some to get it down to 16, but even then we have the problem that these aren't the 64 EGA colors.

So if we’re using VGA, we have 256 colors to work with, so there’s no reason to not use the colors as-is.

Next time, we get backgrounds working!

July 26, 2016

MURIDOS devlog 1: Resources

Filed under: MURIDOS — foone @ 7:30 pm

So, extraction should be done. I’ve got all the original art (objects and backgrounds), levels, sounds, and code extracted out of the MURI Game Maker file. This should have been a simple task, but Game Maker 7 is a mess.

There's a "save all source" option, but it gives you the source of all user-defined functions and nothing else. There's also plenty of source associated with event handlers, which has to be manually extracted. I said "screw it" to the manual process and wrote a script to extract it by automating opening all the various windows and clicking all the buttons.

As for images, backgrounds were simple, as there are only a few and they could be done manually. Object images were much harder, and required automation again to extract them.

For extracting level data I went the opposite route and modified the source, so that when it loaded a new level it also created a pair of text files listing where all the tiles were and their properties, as well as the objects loaded into the level.

Sounds were done manually, although they’re temporary. MURI uses WAV soundfiles for the effects, done in a style to emulate the PC speaker. Well, I’m making a DOS port. I happen to have a PC speaker! So I’ll need to reverse engineer how they work, and encode that into an equivalent series of instructions for the PC speaker. Time to break out the (virtual) oscilloscope!

There was also a lot of cleanup to make later steps simpler: unifying all the files into 256-color images with the same palette (the raw images are true color, even if they only ever use ~20 colors). I created a very simple image format called DUM to easily encode the images, as no existing image format really fit my needs. It's basically just a header and gzipped pixels, no palette or other metadata.

(Most of this was done nearly a year ago. I’m just now getting back to this project)

July 17, 2015

Silent installation of JDK 8 on Windows

Filed under: Uncategorized — foone @ 6:39 pm

This doesn’t appear to be documented or posted anywhere I could find with google, but this is the insane syntax you need:

jdk-8u51-windows-x64.exe /s ADDLOCAL="ToolsFeature,PublicjreFeature" /INSTALLDIRPUBJRE="C:\java\jre" /L "C:\java.txt" INSTALLDIR=C:\java\jdk

INSTALLDIR existed in Java 6/7 but had a slash before it. For some reason in JDK 8, you must do it slashless.

(If you wonder “Why install the JRE/JDK to C:\Java instead of C:\Program Files”, the answer is because You’re Running Jenkins and vcvarsall.bat hates you)

(Also: If you want to wait while the JDK installs, do "start /wait jdk.exe …". But if you're in Cygwin you can't, because start isn't a program, it's a command inside cmd.exe, so do "cmd.exe /c start /wait jdk.exe …")

July 12, 2011

Backstory of Starmap

Filed under: Uncategorized — foone @ 11:56 am

The Modified Neutrino Theory was published by $NAME(indian) in 2062, and the first example of the Catalyzed Neutrino Decay reactor was successfully built by $UNIVERSITY in 2073. By the end of the century, the world was barely recognizable. CND generators provided nearly limitless energy and could be miniaturized enough to power an electric car. With the vast majority of the planet's energy needs switched over to CND, pollution from power plants, automobiles, and airplanes was a thing of the past. Lightweight autonomous rigid airships using modified neutrino decay devices to heat the air within their envelopes cluttered the sky, transferring cargo that once was delivered by truck, train, and ship.

The only area of technology not revitalized by the discovery of this new energy source was space exploration. Because CND generators work by using the curvature of space near massive objects to redirect and capture neutrinos, CND generators only worked up to low Earth orbit, beyond which they could only supply a trickle of power, decreasing to a barely measurable amount once a spacecraft left orbit.

Despite the limitations of CND generators (and the abandonment of the partially completed Space Solar Power Array which was being built at L1), in 2126 a massive project was undertaken to build a moon base. The initial phase involved launching hundreds of small mining robots to the moon using a mass accelerator built up the Serra do Imeri mountain range in Brazil. The robots were powered by CND devices which were useless in transit but reactivated once they reached the moon (although at 1/6th the power level they would have had on Earth). The first goal of the project was building a massive network of large CND devices under the Moon's surface, to provide power for the automatic factories and pressurized domes which would be built later.

It was shortly before the completion of the Lunar CND Array that the Periodic Neutrino Anomaly was detected. Every 24.3 days, there were two bursts of neutrinos 68.3 minutes apart. Both their polarity and the sequence in which they hit the Lunar CND Array showed that they couldn't possibly be solar in origin, and after statistical analysis of stationary CND devices on Earth they were discovered there too. After parallax analysis, it was determined that they were coming from a point near L4, and the partially completed Wide Lunar Array telescope was used to examine the area for the next anomaly.

It was the discovery that defined human history even more than the invention of CND: a small spacecraft of alien origin was repeatedly appearing, deploying solar sails, and then disappearing 68.3 minutes later, having returned to approximately its starting point. Radio and PNC (Polarized Neutrino Communications) signals received no response or reaction from the spacecraft, believed to be an observation probe of some sort. The ESS Armstrong (built for the lunar colonization project) was quickly re-purposed for a mission to observe the probe up close. 68 scientists from 12 nations arrived in the general area of the probe (there was some uncertainty as to where it would appear, as observations of the last 13 cycles showed the probe always appeared within a disc 2km in diameter).

On close inspection, the alien object was confirmed to be some sort of observation probe, but all the telescopes and sensors that could be identified were pointing back towards the disc, none towards Earth or any of the other planets. While maneuvering to better observe the disappearance and eventual reappearance of the probe, all contact with Earth and the lunar outpost suddenly ceased. After a few panicked minutes, contact was re-established over radio. Somehow all neutrinos were being blocked in a cylinder-shaped area beyond the disc. Even solar neutrinos weren't detectable within this area. The scientists on the ship and on Earth were no closer to finding an answer to how this was possible or why it was happening when the probe reappeared. The ESS Armstrong continued observing it from within the cylindrical area (labeled the L4 Neutrino Anomaly), but something entirely unexpected happened when the probe disappeared again, right on schedule: the Armstrong vanished too.

To be continued at some point.

March 31, 2011

March Game: Pipecraft

Filed under: Month Games — foone @ 11:35 pm

Pipe Mania?
For my March game I made a sort of Pipe Mania inspired game, using CraftyJS, a very neat library I sorely underutilized in this game. It uses a component-based design which I had some problems with early on, so I basically just wrote around it (because I was low on time) rather than figuring out how I should really use it. I'm hoping to correct that for my next CraftyJS game. I also made use of CoffeeScript, which is an excellent language that's really just a sanity wrapper on JS. It fixes some JS misfeatures and changes the syntax to something that's more like Python/Ruby. It compiles to JS pretty directly, though for-loops end up looking pretty weird.

I also used some of my Tasari sprites; Tasari was a series of unfinished RTS games I worked on from around 1998-2002. (These sprites are so old they were originally drawn for a Visual Basic 5 game!) Tasari 1 was a VB RTS, Tasari 2 lasted about 20 minutes and was a C++ translation of Tasari 1, and Tasari 3 was a fully 3D mess that got nowhere because I didn't understand model formats, so I just hardcoded all the models into the source. It was deeply ugly.

As for the gameplay itself, it’s pretty simple Pipe Mania with some tweaks in strange directions: You place randomly selected straight/90-degree turn/crossover pieces, and if you replace an existing piece there is a time penalty. You have a time limit, and one of the ways I tweaked it was adding a “sink” tile. Instead of having to create a series of pipes that will survive for Xty seconds/Xty tiles, you have to connect from start to finish within the time limit + travel time of the water/electricity/flooz. It also has levels (A whole 3 of them, including a tutorial level!) instead of just being a blank grid with increasingly difficult time limits.

I’m not sure if it’s the different library, the fact I know JS better now, the lack of multiplayer, or the calming influence of CoffeeScript, but unlike my last JS game I don’t feel like pulling my hair out. (My last JS game was last March, which shows you how much I hated it: It took me a whole year to get back into JS)

PS: I called it “Pipecraft” because I’d developed it in my CraftyJS demo folder, which I’d just named “craft”. Since I’ve played too much Minecraft and based it on Pipe Mania, Pipecraft it is.

