http://www.cnn.com/ELECTION/2008/primaries/results/state/#IA
Most TV is more focused on what a caucus is than on current results… I’m just refreshing this page a lot.
Iowa Caucus today. I’m glued to my computer. But it hasn’t even started.
This article mentions some interesting scenarios. One is that Edwards has been campaigning like mad in Iowa for a long time, so some are suggesting that he might walk away in first place. But Obama’s camp is also expecting a huge turnout: if we can get a flood of young voters to the Caucus, Obama’s a shoo-in. The article even mentions that it wouldn’t really be so surprising if Hillary, generally considered the front-runner, came out in third place. The polls have started contradicting each other: one December 30 poll shows Obama winning slightly, another shows Clinton winning slightly. It’s all within the margin of error, and, on top of that, you have to wonder who’s actually going to show up at the Caucus.
Giuliani is playing his cards… strangely. It looks like he’s blowing off Iowa again. He’s behind even Thompson in Iowa. I’m still surprised that Huckabee is doing so well in Iowa. He and Romney are duking it out there. (And, while I have major issues with a candidate who proclaims that he’s going to recapture our nation for Christ, I think I’d favor him over Romney.)
New Hampshire’s a bit interesting, too. Averaging polls, and mixing gut feeling in, it looks like Clinton enjoys a slight lead over Obama, and both of them are out in front of Edwards. But I think Iowa’s going to play a big role. If Edwards does really well in Iowa, that may bring him success in New Hampshire. Of course I’m crossing my fingers for Obama.
The Republican front gets interesting here, because the candidates polling favorably aren’t the same ones as in Iowa. Giuliani isn’t doing too well here, either (I recall a recent article suggesting that, the more he campaigned here, the more his numbers dropped), but he’s doing better than in Iowa. Huckabee falls tremendously here, though, to a mere 9%. The two big guys here are Romney and McCain. McCain was actually leading in the most recent poll, although a poll a few days earlier said the same about Romney.
South Carolina’s being called another bellwether state. They have split primaries: the R’s go the 19th and the D’s go a week later. The South Carolinians show no love for their neighbor to the North, John Edwards, who’s polling at 17% pretty consistently. Here, Obama and Clinton are also neck-and-neck, although what’s interesting is that Obama has been closing in: in previous surveys he wasn’t nearly as close. The Republican field is quite fragmented: Giuliani, McCain, Romney, and Thompson are all pretty close, although Huckabee enjoys a significant lead with 28% of the vote. I’m thinking that Iowa and New Hampshire might shake things up a bit: Thompson and McCain aren’t looking viable in the first two, so perhaps their supporters will get behind another candidate.
Before South Carolina, though, we have Michigan, which hasn’t been getting polled that often. It looks like Romney and Huckabee are the two big guys there. We have to go back to November to find Democratic results, but it looks like Clinton has a significant lead in Michigan. And then there’s Florida, where Giuliani leads, with Huckabee and Romney essentially tied for second. Hillary seems to enjoy a significant lead in the Democratic race there.
Don’t get too caught up in the instant gratification of watching who wins tonight, though. The next week is going to be exciting, and then there’s Super Tuesday (or Super Duper Tuesday, as it’s now being called), with over 20 states holding primaries the first Tuesday in February. And it might not even be settled in February: as the map shows, a sizable number of states have later primaries. Montana and South Dakota are off in the Twilight Zone, holding primaries in June. (Think they’ve had a lot of candidate visits? Then again, think they’re getting a lot of calls?) The DNC is at the end of August, and the RNC is the next week, starting off September.
I tend to think of web hosting in terms of many sites to a server. And that’s how the majority of sites are hosted–there are multiple sites on this one server, and, if it were run by a hosting company and not owned by me, there’d probably be a couple hundred.
But the other end of the spectrum is a single site that takes up many servers. Most any big site is done this way. Google reportedly has tens of thousands. Any busy site has several, if for nothing else than load balancing.
Lately I’ve become somewhat interested in the topic, and found some neat stuff about this realm of servers. A lot of things are done that I didn’t think were possible. While configuring my router, for example, I stumbled across stuff on CARP. I’d always thought of routers as a single point of failure: if your router goes down, everything behind it goes down. But CARP lets two (or more) routers share a virtual IP address, so in mission-critical setups a standby router can take over if the primary dies.
One thing I wondered about was serving up something that had voluminous data. For example, suppose you have a terabyte of data on your website. One technique might be to put a terabyte of drives in every server and do load balancing from there. But putting a terabyte of drives in each machine is expensive, and, frankly, if you’re putting massive storage in one machine, it’s probably huge but slow drives. Another option would be some sort of ‘horizontal partitioning,’ where five (arbitrary) servers each house one-fifth of the data. This reduces the absurdity of trying to stuff a terabyte of storage into each of your servers, but it brings problems of its own. For one, you don’t have any redundancy: if the machine serving sites starting with A-G goes down, all of those sites go down. Plus, you have no idea of how ‘balanced’ it will be. Even if you tried some intricate means of honing which material went where, the optimal layout would be constantly changing.
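Just to make the partitioning idea concrete, here’s roughly what I’m picturing, as a quick sketch (the hostnames and the hashing scheme are made up, and a real setup would need something smarter):

import hashlib

# Hypothetical horizontal partitioning: each site's data lives on exactly one of five servers.
# Hashing keeps the mapping stable, but it can't rebalance as some sites grow busier than others.
STORAGE_SERVERS = ["store1", "store2", "store3", "store4", "store5"]

def server_for(site):
    digest = hashlib.md5(site.encode()).hexdigest()
    return STORAGE_SERVERS[int(digest, 16) % len(STORAGE_SERVERS)]

print(server_for("example.com"))  # always the same server, and the only copy of that data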
Your best bet, really, is to have a bunch of web machines, give them minimal storage (e.g., a 36GB SCSI drive–a 15,000 rpm one!), and have a backend fileserver that has the whole terabyte of data. Viewers would be assigned to any of the webservers (either in a round-robin fashion, or dynamically based on which server was the least busy), which would retrieve the requisite file from the fileserver and present it to the viewer. Of course, this places a huge load on the one fileserver. There’s an implicit assumption that you’re doing caching.
But how do you manage the caching? You’d need some complex code to first check your local cache, and then turn to the fileserver if needed. It’s not that hard to write, but it’s also a pain: rather than a straightforward, “Get the file, execute if it has CGI code, and then serve” process, you need the webserver to do some fancy footwork.
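Here’s roughly the logic I mean, as a rough sketch (the mount point and cache directory are invented, and a real version would also need cache expiry and locking):

import os
import shutil

CACHE_DIR = "/var/cache/webfiles"      # small, fast local disk on the webserver
FILESERVER = "/mnt/fileserver"         # mount of the big backend fileserver

def fetch(path):
    cached = os.path.join(CACHE_DIR, path)
    if os.path.exists(cached):
        return cached                  # cache hit: don't touch the fileserver at all
    source = os.path.join(FILESERVER, path)
    os.makedirs(os.path.dirname(cached), exist_ok=True)
    shutil.copyfile(source, cached)    # cache miss: copy it locally for next time
    return cached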
Enter Coda. No, not the awesome web-design GUI, but the distributed filesystem. In a nutshell, you have a fileserver (or multiple fileservers!), and each client mounts a partition called /coda that transparently refers to files out on the network, caching them locally as needed. That’s massively oversimplifying things: the intended use is to let you, say, bring your laptop into the office, work on files on the fileserver, and then, at the end of the day, seamlessly take them home with you to work from home, without having to worry about where the files physically reside. So running it just for the caching is practically a walk in the park: you don’t have complicated revision conflicts or anything of the sort. Another awesome feature of Coda is that, by design, it’s pretty resilient: part of the goal with the caching was to handle the fileserver going offline fairly gracefully. So really, the more popular files would be cached by each node, with only cache misses hitting the fileserver. I also read an awesome anecdote about people running multiple Coda servers. When a disk fails, they just throw in a blank. You don’t need RAID, because the data’s redundant across other servers. With the new disk, you simply have it rebuild the missing files from the other servers.
There’s also Lustre, which was apparently inspired by Coda. It focuses on insane scalability, and it’s apparently used in some of the world’s biggest supercomputer clusters. I don’t yet know enough about it, really, but one thing that strikes me as awesome is the concept of “striping” a file across multiple nodes.
The Linux HA project is interesting, too. There’s a lot of stuff that you don’t think about. One is load-balancer redundancy: of course you’d want it, but if you simply switched over to your backup router, all existing connections would be dropped. So they keep a UDP data stream going, where the master keeps the spare(s) in the loop on connection states. Suddenly having a new router or load balancer appear can also confuse the network, so if the master goes down, the spare comes up and starts spoofing its MAC and IP addresses to match the node that went down. There’s a tool called heartbeat, whereby standby servers ping the master to see if it’s up. It apparently has some fairly complex workings, and they recommend a serial link between the nodes so you’re not dependent on the network. (Granted, if the network to the routers goes down, it really doesn’t matter, but having them quarreling over who’s master will only complicate attempts to bring things back up!)
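For the curious, a minimal Heartbeat setup apparently looks something like this. It’s a sketch pieced together from the documentation, with invented node names and addresses, so don’t take it as gospel:

# /etc/ha.d/ha.cf (sketch)
keepalive 2          # seconds between heartbeats
deadtime 30          # declare the peer dead after 30 seconds of silence
serial /dev/ttyS0    # heartbeat over a serial link, so it doesn't depend on the network
bcast eth0           # ...and over the LAN as well
auto_failback on
node router1
node router2

# /etc/ha.d/haresources (sketch): router1 normally owns the shared IP
router1 IPaddr::192.168.1.1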
And there are lots of intricacies I hadn’t considered. It’s sometimes complicated to tell whether a node is down or not, and it turns out that a node in an ambiguous state is often a horrible state of affairs: if it’s down and not pulled out of the pool, lots of people will get errors, and if other nodes are detecting oddities but it’s not down, something is still awry with that server. There’s a concept called fencing that I’d never heard of, whereby the ‘quirky’ server is essentially shut out by its peers to keep it from screwing things up (not only might it run away with shared resources, but the last thing you want is a misbehaving server trying to modify your files). The ultimate example of this is STONITH, which sounds like a fancy technical term (and, by definition, now is a technical term, I suppose), but really stands for “Shoot The Other Node In The Head.” From what I gather from the (odd) description, the basic premise is that if members of a cluster suspect one of their peers is down, they “make it so” by calling external triggers to pull the node out of the network (often, seemingly, by just rebooting the server).
I don’t think anyone is going to set up high-performance server clusters based on what someone borderline-delirious blogged at 1:40 in the morning because he couldn’t sleep, but I thought someone else might find this venture into what was, for me, new territory interesting.
Disclaimer: I can tell right now that this is one of those late-night posts where I should be sleeping, not posting about a technical topic. But these not-entirely-lucid ones are sometimes the most fun to read.
I consider myself extremely tech-savvy. I can build a computer from parts, make my own Ethernet cables, run some performance tuning on interactive websites, write applications in numerous programming languages (as well as SQL and HTML), and much more.
But I still don’t get our digital thermostat. It’s programmed to go down to 58 at night, and come up to 67 on weekends and from something like 6 to 9 a.m. and 3 to 9 p.m. on weekdays. In other words, when people are home.
Of course, me being home on vacation isn’t quite compatible with this. There’s a simple override, where you can hit the up or down arrows to set it to a temperature. While I use (and appreciate!) this, it’s also a pain. It’s really no fun waking up and having it be 58. I’d really like to reprogram it to automatically come up to 63 or so around 10:30.
I still don’t get why the whole thing isn’t on the LAN. This would have two obvious benefits right out of the gate–it’d be much easier to configure (even if you let someone with no clue about usability design the GUI, it’ll be better than the myriad knobs, switches, and buttons on our thermostat!), and it’d be more convenient in many cases to pull up a new tab in your web browser than to walk down the hall to the thermostat. (Plus, the thermostat is in my parents’ bedroom. I’d have loved to have turned the heat up a few degrees around 11 tonight, since it’s 9 outside and almost as cold inside. But something tells me they really wouldn’t have appreciated it.)
I’m also not sure that the ‘simple’ thermostat algorithm is that efficient. You figure it works something like:
while (1) {
    $temp = getTemperature();
    $desired = readDial();
    if ($temp < $desired) { furnace.enable; }
    if ($temp > $desired) { furnace.disable; }
}
When we view it at ‘computer speed,’ I think we can see one of the basic problems: in theory, the furnace could start flapping, where on one loop iteration it turns the furnace on, and just a fraction of a second later, it turns it off. I don’t profess to know a lot about the overhead in starting a furnace, but I’d imagine that it’s most efficient to let it run for a few minutes.
I think a much better system would be to have a programmed minimum run time: if the furnace is turned on, we should run it for at least 5 minutes. After 5 minutes, we again evaluate the temperature: if it’s at the target, we turn it off. If not, we drop into a quicker polling, maybe once every minute. Incidentally, this is much better for the thermostat’s processor, but if its sole purpose is determining whether to turn something on or off, no one really cares about minimizing overhead.
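Something like this is what I have in mind. It’s only a sketch, and the hardware functions are hypothetical stand-ins for whatever the thermostat actually exposes:

import time

MIN_RUN_SECONDS = 5 * 60   # once the furnace starts, let it run at least this long
IDLE_POLL = 30             # check every 30 seconds while the furnace is off
ACTIVE_POLL = 60           # once it's running, re-check every minute

# Hypothetical stand-ins for the real hardware interfaces.
def get_temperature(): ...
def read_setpoint(): ...
def furnace_on(): ...
def furnace_off(): ...

def control_loop():
    while True:
        if get_temperature() < read_setpoint():
            furnace_on()
            time.sleep(MIN_RUN_SECONDS)                  # guaranteed minimum run time
            while get_temperature() < read_setpoint():   # keep going until we hit the target
                time.sleep(ACTIVE_POLL)
            furnace_off()
        time.sleep(IDLE_POLL)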
So you give it a secondary purpose: handling a TCP/IP stack and a basic webserver! All of a sudden, instead of an infinite loop, you run a tiny bit of code every 30 seconds.
You can also generate some interesting statistics. For example, how long does the furnace need to run to raise the temperature one degree? How does this scale–if you want to raise it three degrees, does it take three times as long? How does the temperature of my house look when graphed across a day? How about telling me how long the furnace ran yesterday? And, given information about my furnace’s oil consumption and our fuel costs, it’d be cool to see how much it’s costing. And it could give us suggestions: “If you drop the temperature from 68 to 67, you’ll save $13.50 a month,” or such. This would require some storage, but a gig of solid-state media (e.g., a camera’s SD or CF card) is around $10-20 now. Plus, with the advent of AJAX, you can push some of the processing off to the client–let the client use a Flash applet or some good Javascript to draw the graphs if the thermostat is underpowered!
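The math behind a suggestion like that is trivial. A back-of-the-envelope sketch, with every number invented as a placeholder for what the thermostat would actually log:

GALLONS_PER_HOUR = 0.85       # oil burned while the furnace runs (made up)
PRICE_PER_GALLON = 3.30       # current fuel cost (made up)
HOURS_SAVED_PER_DAY = 0.15    # shorter run time from setting it one degree lower (made up)

monthly_savings = HOURS_SAVED_PER_DAY * 30 * GALLONS_PER_HOUR * PRICE_PER_GALLON
print("Dropping one degree saves roughly $%.2f a month" % monthly_savings)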
In conclusion, I’m freezing.
As has happened in the past, it seems like the election has been reduced to 2 or 3 talking points–immigration, health care, and Iraq, to name the big ones.
I was brushing up on Obama’s stance on the issues, and found something that really excited me. Check out his page on ethics. It’s not vague talk about how lobbying is bad. He has an awesome plan:
It’s ambitious, but boy would it be awesome! It’s funny: it almost seems like it’s somehow wrong that I should be able to see exactly what my elected officials are doing. And yet it’s really exactly what our government is all about: transparency. Wow-a-wee-wow!
Why isn’t there a really good “network appliance” to act as a network gateway? Right now you can get a low-end consumer firewall/router, or you can build your own machine; there’s not much in between.
Setting up OpenBSD is no walk in the park, though. I want to build an “appliance” based on OpenBSD, and give it a nice spiffy web GUI. You buy the box, plug one side into your switch and one side into your cable modem or whatnot, and spend ten minutes in a web browser fine-tuning it. I was really fond of the appearance of the Cobalt Qube, although it could be made much smaller. And throw a nice LCD on the front with status. You can run a very low-power CPU, something like the one powering these. It really doesn’t need more than 512MB RAM, but give it a small solid-state drive. And a pair of Gigabit cards, not just for the speed, but because GigE cards usually are much higher-quality. In building routers, the quality of your card determines how hard the CPU has to work.
There’s so much that a router can do. You can run a transparent caching proxy, a caching DNS server, priority-based queuing of outgoing traffic (such as prioritizing ACKs so downloads don’t suffer because of uploads, or giving priority to time-sensitive materials such as games), NAT, an internal DHCP server, and, of course, a killer firewall. You can also generate great graphs of things such as bandwidth use, blocked packets, packet loss, latency…You can regulate network access per-IP or per-MAC, and do any sort of filtering you wanted. It could also easily integrate with a wireless network (maybe throw a wireless card in, too!), serving as an access point and enabling features like permitting only certain MACs to connect, requiring authentication, or letting anyone in but requiring that they sign up in some form (a captive portal). And I really don’t understand why worms and viruses spread so well. It’s trivial to block most of them at the network level if you really monitor incoming traffic.
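As one example, the ACK-prioritization trick is just a few lines of ALTQ configuration in OpenBSD’s pf. This is only a sketch: the interface name and bandwidth are placeholders you’d tune to your own uplink.

# pf.conf sketch: keep empty ACKs flowing even when uploads saturate the line
ext_if = "em0"

altq on $ext_if priq bandwidth 800Kb queue { q_pri, q_def }
queue q_pri priority 7
queue q_def priority 1 priq(default)

# With two queues listed, pf assigns small TCP ACKs to the second, higher-priority queue
pass out on $ext_if proto tcp from ($ext_if) to any flags S/SA keep state queue (q_def, q_pri)
pass in  on $ext_if proto tcp from any to ($ext_if) flags S/SA keep state queue (q_def, q_pri)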
I’m frankly kind of surprised that nothing of this level exists. I think there’s a definite market for quality routers. A $19 router does the job okay, but once you start to max out your connection, you’ll really notice the difference. A good router starts prioritizing traffic, so your ssh connection doesn’t drop and your game doesn’t lag out, though your webpages might load a little slower. An average router doesn’t do anything in particular and just starts dropping packets all over the place, leaving no one better off. (And a really bad router, like our old one, seems to deal with a fully-saturated line not by dropping excess packets or using priority queueing, but by rebooting itself, leaving everyone worse off… I think this may have had to do with the duct tape.)
First post of 2008!
Today’s bit of advice: before you wade in too deep, going cross-eyed staring at routing tables that seem perfectly fine and trying to figure out why half of your LAN is unreachable, check whether the freaking second network cable is plugged in.
Posting this in the hopes that it’ll help someone at some point….
Using Apache (Apache2 in my case, but I’m not sure it matters), you can customize the format for log files like access_log. Apache has a good page describing the variables you can use. But it doesn’t tell you everything you need to know!
The first question is where to put it… You can just specify it in httpd.conf (I put it near the end, but I don’t think its placement matters terribly, as long as it’s not in the middle of a section). It doesn’t go inside any directives or anything. You can also put it inside a VirtualHost directive if you only want it to apply to that host. (Don’t put it inside a Directory directive!)
The second thing is something that’s not really specified anywhere: specifying a LogFormat without then specifying a CustomLog directive accomplishes nothing! I wanted to keep Apache logging in the default directory (/var/log/apache2/access_log on Gentoo), so I just set the LogFormat to something I wanted. And nothing happened.
You specify the format in CustomLog as well, so it’s handy to use LogFormat to assign a “nickname”:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" n1zyy
CustomLog /var/log/apache2/access_log n1zyy
The first line sets the “n1zyy” nickname to refer to the format I specify. The second line sets up a “custom” log file (in this case, the same path as the default, but the custom format won’t take effect unless I specify it) and tells Apache to use the format named “n1zyy.”
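And if you only want the custom format for a single site, the same nickname works inside a VirtualHost; a hypothetical example.com host would look something like:

<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/example.com
    CustomLog /var/log/apache2/example.com-access_log n1zyy
</VirtualHost>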
Once this is set up, you want to reload Apache, since it won’t notice your changes until you do.
In the most recent polls, Obama is leading narrowly in New Hampshire. And it’s practically a banal phrase at this point, but Iowa is a crapshoot: the “big three” (Edwards, Clinton, and Obama) are pretty much tied. Right now it looks like Edwards is leading, which people thought was unlikely. Thus I’m not too worried at the moment about Hillary’s triumphs in other places.
But for the first time in a while, I’m feeling really excited. This could actually happen!
I’m starting to get interested in the Republican primaries as well: they’re seeming pretty fragmented. Romney and Rudy each enjoy big leads in different states, and McCain and Huckabee are notable contenders in some states, too. (Somewhat humorously, at least to me, Romney has a pathetic 7% in Massachusetts, although the poll is ancient. Someone ought to do a new poll of Massachusetts voters.)
Plans are still up in the air, but I may well end up volunteering over at the Obama headquarters later today. The nation is watching us, and I don’t want to sit idly by. We can do this!