Brian Klug: Internet Adventures
Give me technology, or give me death!
Feb 8th
Whether you like it or not, the big news today wasn’t the outcome of “The Big Game,” the 2010 Toyota Prius Recall, or the fact that Verizon is “deliberately” blocking 4Chan for wireless customers (though those last two are admonishable attempts by the respective companies to submarine news).
It was the fact that today, Google advertised its core search product on TV in a $2.6 million Super Bowl ad. Wait, did I just say Super Bowl? I meant “Big Game.”
Hell proverbially froze over, by CEO Eric Schmidt’s own admission.
But if you actually watch the video, and watch closely, you’ll notice that very little of the advertisement focuses on the search experience itself. It spends so much effort building trite emotional appeal that it completely neglects at least half of the front-facing search experience. In fact, what it disregards is a feature so neglected that even I didn’t realize it had been passed over completely until I watched a parody.
First, watch the “Parisian Love” ad itself:
Now watch the brilliant parody “Is Tiger Feeling Lucky Today” by Slate:
Disregarding completely the message, the search terms, what the so-called “story” was, did you notice how differently Google advertised their own product compared to how well Slate did? Slate used “I’m Feeling Lucky.” Google? Not once. In fact, doing so could have been absolutely brilliant in the context of the ad’s cheesy romance theme. Imagine “will she marry me” -> I’m feeling lucky.
So what that communicates is that even Google doesn’t know what the heck “I’m Feeling Lucky” is doing there. Ask yourself, when is the last time you actually used it? Is it easily accessible? Is it part of that seamless, effortless Google experience they talk about? Is it so essential a part of the search experience that if it was missing, some part of your being would be inexorably changed forever?
You get the point. It isn’t.
There’s nothing easy about using “I’m Feeling Lucky;” you can’t get to it with shift-enter or any other keyboard shortcut. It isn’t natural; everyone’s so used to just hitting enter or using the browser search bar. I ask then what purpose it’s serving.
For my answer, I googled. I didn’t use “I’m Feeling Lucky” :
The “I’m Feeling Lucky™” button on the Google search page takes you directly to the first webpage that returns for your query. When you click this button, you won’t see the other search results at all. An “I’m Feeling Lucky” search means you spend less time searching for web pages and more time looking at them. -Link
Oh really? That’s, you know, awesome, but isn’t diving head first into the first result of some search query just as dangerous as using link shorteners? As opening links in email blindly? As bad as everything we’ve always taught people not to do? Moreover, isn’t randomly guessing kind of a bad algorithm for mentally sorting through search results? I mean, if you use “I’m Feeling Lucky,” you’re going to have to come all the way back out to the front to re-submit your query. What’s elegant, beautiful, or simple about that?
Take a step back and think about the name of that button as well. What does “I’m Feeling Lucky” imply? Why the need for obscurity? Why not just call it “First Result” or “Dive In Blindly!™” or something else that’s approachable and friendly?
Years ago, the first time I clicked this, I half expected to be taken to some sort of contest entry form.
We’ve all read a lot, and I mean a lot about how much time, effort and money Google pours into keeping their famously-lightweight homepage simple. They’ve evolved the design. They’ve removed things. They make it fade in slowly so those of us challenged by reading aren’t scared or overwhelmed. They count and have sleepless nights over the number of words on it!
Oh, I know what you’ll say, it’s part of their “corporate identity,” part of their “product,” part of what makes Google, Google. Nonsense; that’s the kind of talk that turns innovation into stagnation for the sake of consistency. My high school English teacher would be proud, because two of his favorite quotes apply directly to the kind of idiotic allegiance they have to that worthless button:
- A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. -Ralph Waldo Emerson
- Consistency is the last refuge of the unimaginative. -Oscar Wilde
For all of Google’s engineering talent, all that time, all those fancy positions, titles, and critical thought, they don’t realize that their biggest Sacred Cow is staring them in the face. That “Sacred Cow” is “I’m Feeling Lucky.”
C’mon Google, even you don’t use it or know why it’s there.
Feb 7th
If you’ve read my big post on the Zoneminder configuration I have at home, you’ll notice that I favored capture of JPEG stills over using MJPEG during initial configuration.
At the time, the reason was simple; I couldn’t make MJPEG work. I’ve now succeeded in doing so, and understand why it didn’t work the first time.
I remembered reading something in the Zoneminder documentation about a shared memory setting resulting in capture at higher resolutions failing. Originally, when I first encountered the problem I decided that it was simply me getting something wrong with the path to the .mjpeg streams on the cameras, since I was more familiar with capture of jpeg stills from prior scripting.
However, I stumbled across some documentation here from another tinkerer, which also pointed to the memory sharing issue.
The problem is that the buffer of frames (usually between 50 and 100 for the camera) must be contained in memory for processing. If the total size of those buffered images exceeds the shared memory maximum, you’ll run into errors or see the camera status go to yellow/orange instead of green. (It can get pretty confusing trying to troubleshoot based on those status colors unless you’re checking the logs… /doh)
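As a rough sanity check, you can estimate the buffer size before touching any kernel settings. This is a minimal sketch that assumes the buffer needs about frames × width × height × bytes per pixel and ignores whatever overhead ZoneMinder adds on top; the function names and the 50-frame example are mine, purely for illustration.

```python
def buffer_bytes(frames, width, height, bytes_per_pixel=3):
    """Approximate bytes needed to hold the frame buffer (overhead ignored)."""
    return frames * width * height * bytes_per_pixel


def fits_in_shmmax(frames, width, height, shmmax, bytes_per_pixel=3):
    """True if the estimated buffer fits under a shared memory maximum in bytes."""
    return buffer_bytes(frames, width, height, bytes_per_pixel) <= shmmax


# Hypothetical example: 50 frames of 1280x1024 in 24-bit colour
needed = buffer_bytes(50, 1280, 1024)
print(needed, fits_in_shmmax(50, 1280, 1024, 134217728))
print(needed, fits_in_shmmax(50, 1280, 1024, 536870912))
```

Under those assumptions, a megapixel camera with a 50-frame buffer lands well above a 134217728-byte limit, which lines up with the symptoms described here.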
In fact, the problem I was seeing was likely a direct result of the large capture image size of my Axis 207mW, as they cite it directly:
Note that with Megapixel cameras like the Axis 207mw becoming cheaper and more attractive, the above memory settings are not adequate. To get Zoneminder working with a full 1280×1024 resolution camera in full colour, increase 134217728 to, for example, 268424446
/facepalm. I really wish I had come across this the first time around. Either way, you’re going to ultimately run into this problem with either higher framerate connections, color, or higher resolutions.
I followed the tips here, but doubled them since the machine I’m running ZM on has a pretty good chunk of memory available.
The process is simple. You’re going to have to edit /etc/sysctl.conf to include the following somewhere:
# Memory modifications for ZoneMinder (kernel.shmmax = 512 MB, in bytes; kernel.shmall is counted in pages, not bytes)
kernel.shmall = 33554432
kernel.shmmax = 536870912
Now, apply the settings with
sysctl -p
Which forces a reload of that file. Next, you can check that the memory parameters have been changed:
brian@brian-desktop:~$ cat /proc/sys/kernel/shmall
33554432
brian@brian-desktop:~$ cat /proc/sys/kernel/shmmax
536870912
Which is successful. You can also check it with ipcs -l. Now, restart ZoneMinder and you shouldn’t have any problems.
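If you’d rather script the check than eyeball the cat output, something like the following sketch works. The function names are made up for illustration, and it assumes a Linux box that exposes the values under /proc/sys/kernel.

```python
from pathlib import Path


def read_kernel_param(name, proc_root="/proc/sys/kernel"):
    """Return the integer value of a kernel parameter such as 'shmmax'."""
    return int(Path(proc_root, name).read_text().strip())


def check_shm(expected, proc_root="/proc/sys/kernel"):
    """Map each parameter name to True if the live value matches the expected one."""
    return {
        name: read_kernel_param(name, proc_root) == value
        for name, value in expected.items()
    }


# Example call with the values set above:
# check_shm({"shmall": 33554432, "shmmax": 536870912})
```

The proc_root argument is only there so the check can be pointed at a test directory instead of the real /proc.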
Having made these changes, I was ready to finally explore whether MJPEG works! I went ahead and decided to use the MJPEG streams from my two respective types of cameras in place of the static video links. These are:
Linksys WVC54GCA: http://YOURIPADDY/img/video.mjpeg
Axis 207mW: http://YOURIPADDY/axis-cgi/mjpg/video.cgi?resolution=640x480&clock=0&date=0&text=0
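If you have several cameras, it can be easier to build these URLs in a script than to paste them by hand. Here is a minimal sketch that reproduces the two URL shapes listed above; the function names and default values are my own, and only the query parameters that appear in the Axis URL above are used.

```python
from urllib.parse import urlencode


def axis_mjpeg_url(host, width=640, height=480, clock=0, date=0, text=0):
    """Build an MJPEG stream URL in the same shape as the Axis 207mW example."""
    query = urlencode({
        "resolution": f"{width}x{height}",
        "clock": clock,
        "date": date,
        "text": text,
    })
    return f"http://{host}/axis-cgi/mjpg/video.cgi?{query}"


def linksys_mjpeg_url(host):
    """Build the fixed MJPEG stream URL used for the Linksys WVC54GCA."""
    return f"http://{host}/img/video.mjpeg"


print(axis_mjpeg_url("YOURIPADDY"))
print(linksys_mjpeg_url("YOURIPADDY"))
```

Swap YOURIPADDY for the camera’s address; which resolutions the camera actually accepts is something to confirm on its own config page.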
I also discovered (by reading the manual) that there’s a handy utility on the Axis config page (under Live Video Config -> HTML Examples -> Motion JPEG) which generates the proper URL based on a handy configuration tool where you can select size, compression, and other options:
The idle load on the system has increased, as expected, but that’s partly from me raising the FPS limit to 10 which seems reasonable, and enabling continual recording with motion detection (mocord).
I’m making a lot of tweaks as I get ready to transition everything onto a VM on a faster computer with much more disk space (on the order of 8 TB). If you’re interested in reading more about the Linux kernel shared memory settings, I found some good documentation:
Feb 1st
These past couple of days, I’ve finally gotten some time to work on the tremendous backlog of photos that I have sitting around from a number of trips. Among those pictures are sets of photos in the hundreds destined for photosynth. A number of my friends have expressed interest in what the software is, what it does, how it works, and how to take photos best suited for processing. I think now is a great opportunity to go over the basics.
First of all, what Photosynth does is create a 3D point cloud model/representation of an object or scene from a set of photos. Depending on the scene complexity, the number of photos might be in the tens, or hundreds for sufficiently complicated scenes. It all depends on the model and how much time you have on your hands.
Perhaps the best way to explain it is to show it. The following is a synth of the Pantheon that I recently finished processing, constructed from photos my brother and I took with a D80 and a D90:
The software uses feature extraction to identify textures in parts of each image that are similar, then tries to fit the corresponding features from each image together to create a perspective-correct view. The process is extremely computationally intensive, but only needs to be done on the initial set of images to determine position and location. The beauty, of course, is that this process requires no human input for reconstructing the scene; it’s entirely computationally derived.
I won’t claim to be the most qualified to talk about it, but it does use feature extraction and some fancy fitting to work. An important note is that the software works based on unique features in texture, not necessarily on structure. This is why synths with lots of unique patterns turn out extremely well, while others don’t.
Creating the actual Synth is actually the easiest step; just create an account, install the software, add your photos, and go.
The real work in that process is creating proper tags, descriptions, and then adding geotagging data from photos, or later on in the web interface. Doing so is a great way to get your synth recognized.
There’s a great how-to on the official Photosynth website that goes over how to take pictures optimally, but I’d like to share some of my own.
If I’m taking a photo of a single object, something like this column, for example, I’ll try to stay equal distance away from the object, and take photos in steady progression around the subject.
The important thing to keep in mind is that although Photosynth can extrapolate the point cloud from features, it still cannot extrapolate images that you haven’t given it. Simply put, if you want to get the nice scrubber bar to circle around an object, you’ll need to take the requisite photos to make it. I find that pacing steadily around while taking photos at regular intervals is the best way.
Some of my favorite Photosynth creations are:
For equal comparison, here are some that didn’t turn out so well:
Jan 28th
Yesterday, just about everyone’s minds were on the iPad. Love it or hate it, what a ride that hype machine was, and what a launch too. But for me, my musings (or rather those of my roommate) were rudely interrupted by the loud boom of a car crashing through the retaining wall surrounding my house.
Apparently, an inebriated woman was proceeding northbound on Euclid in a white Infiniti when she struck a midsize black Mercedes SUV, and flew up, into, and through the cinder block wall surrounding my house. The force must have been pretty awesome, since the hole is sizable. Definitely a lot of momentum (and a resulting few meganewtons of force) went into that smash, since there’s shattered cinder block in my yard now.
They managed to destroy a lot of cactus, blocks, and the sign on the corner in the process. I’d like to point out the irony of an Infiniti with license plate ‘finiti’ crashing through my wall on Euclid (as in, the fabled “father of geometry”). ::shrug::
I know nothing about the occupants’ statuses or health, but hope they’re faring OK. Lesson learned? Don’t drink and drive, kids.
Jan 26th
Tomorrow’s big unveiling will likely be focused around the much-hyped, illusory tablet (whose name nobody even knows), but a large part of its launch will be iPhone OS 4.0.
Much debate has taken place regarding whether the tablet will run OS X, iPhone OS, or something in-between. Thanks in large part to an errant comment by the McGraw Hill CEO on MSNBC, I think it’s safe to assume that iPhone OS was the right guess all along.
Or was it? It’s likely that instead of releasing the tablet running the 3.x line OS, Apple will launch a 4.x fork to bridge the handheld iPhone/iPod Touch experience with that of the tablet, and in so doing bring the mobile OS closer to its desktop counterpart. It makes sense considering keeping two disparate app stores running could cause a colossal flustercuck.
Since its launch in March of 2009, OS 3.x has begun showing some signs of age, especially compared to Android 2.x. Here’s a list of what I think OS 4.0 needs to really keep the platform competitive:
Although Google just launched an HTML5 version of the Google Voice interface (no doubt specifically targeted at the iPhone and WebOS platforms), it still pales in comparison with how seamless Google Voice integration is on Android. Users of that platform can completely transition to their new number.
Plus, let’s be honest, using a web based version is just hackety compared to being able to use a much more responsive app without having to jailbreak. Until the day comes, I’ll stick with using the still-banned GV Mobile.
This is the one that actually started the whole Google – Apple divorce, in case you have forgotten. It’d be amazing to finally see Latitude integrated into the maps app the way it should have been the same month Latitude launched.
Even better, the maps application (maintained by Google) on BlackBerry OS and Android allows for seamless background position updating. As it is right now, iPhone OS users have to go to an HTML5-based version of the same application to update their position. Or jailbreak and use a solution like Longitude (some screenshots/info here) and have it done on a schedule by a persistent background process. This is the solution I ultimately decided on.
Perhaps this functionality isn’t allowed because of “duplication of functionality” with Mobile Me? Whatever.
Let’s just come out and say it, the mail app on the iPhone is extremely barebones. Coming from Windows Mobile, I was kind of shocked at how barebones, in fact. No ability to change font, underline, bold, italicize, or do anything regarding formatting. As it is right now, the best you can do is some copy paste.
ArsTechnica really did a good job highlighting a number of subtleties that I’ve noticed in their article here. The most annoying of which is that folders aren’t fully synchronized until you go into them. For example, opening a sent folder will cause all the sent emails to load chronologically. This can get frustrating if you’ve sent a lot and just want to look at one; instead, you’ll have to wait for all of them to load. I can do without a unified inbox or unified messaging app, because honestly I view that as a more of a nightmare to be avoided than a feature.
But those aren’t my main gripe; it’s that there isn’t a Gmail app (like what Android has) that supports Labels, Stars, or any of the features that make Gmail integration with email clients over IMAP or Exchange difficult. It’s that whole decision they made to not use “folders” and instead use labels that drives me crazy, and to this day, I’m lucky if I can find any sent email in my Google Apps account.
Forget about background and push, just fix the email client.
Even though the platform has good customization for ringtones, the alert sounds for system events such as email and new SMSes are surprisingly limited. In fact, at first, I assumed I was “doing it wrong” and failing at finding the proper way to load them. Nope, turns out, what you have is what you have, and what you will have forever.
That default “Tri-tone” sound is what everyone uses, and it’s annoying as hell to have it go off in a crowded room and watch 8 people all go for their phones (myself included). Allow some variety, without the need to jailbreak.
A lot of the other platforms also have alert profile scheduling. Namely, you can specify whether you should be alerted audibly, with vibration, or not at all, on a time schedule throughout the day. I’ve defaulted to always leaving my phone on vibrate simply because this is missing.
This is probably everyone’s #1 wish for OS 4.0. Multitasking done right. Sure, you can jailbreak and do it, but it doesn’t lend itself to having a nice task-switcher. Instead, you’re left using what amounts to a task manager, which is completely the wrong way to do it.
Every other platform has it, but only one platform (WebOS) has done it right so far. Can you, Apple?
If you’re like me, you have 9 pages of applications that you’ve tediously organized. But sometimes, categories that are logical don’t come in sets of 16 (how many you can fit on one ‘page’). The real solution is to allow some sort of management. Be that folders, a menu, or something else.
Also, there’s no reason that people should be limited to 4 apps on the bottom row just for aesthetics when you have the room for 5. I couldn’t live without having 5 anymore.
Something that I think Android really executed properly was the centralized power management screen. HTC has added this to virtually every single device in recent memory as well. That feature is centralized management of radio hardware and other large current draws.
This is something that, if executed properly, could also be a selling point for making hardware “green.” Hell, as a potential EE, I’d be absolutely in love with a screen showing current consumption from all the chipsets in the hardware that report it, plots of use vs. time, and more intelligent prediction of how much life I’ll get out of the device with current use.
But on a more basic level, what we really need is a feature that allows users to schedule the hardware itself. Imagine you’re on a trip without your charger; odds are, you don’t need the radio hardware on while you’re sleeping, but you do need the device on so the alarm works. Allowing users to schedule power events lets you balance use ahead of time.
But a feature I think is really needed is a so-called “last legs” setting. Basically, after the battery has crossed a user-defined threshold (say 15-25%), the software automatically does everything it can to preserve battery life; WiFi is turned off, 3G is turned off in favor of EDGE, screen brightness is reduced to 20%, push services are put on hold, email fetch intervals are doubled or quadrupled, background processes are killed.
The hardware and software essentially would work together to squeeze every last minute of use out of the hardware when battery gets low. This is especially important for when you cross the threshold while the phone is in your pocket, when you probably don’t even know it’s dangerously close to death.
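To make the “last legs” idea concrete, here is a rough sketch of what the policy could look like. Everything in it is hypothetical: PowerState, its fields, and the default values are illustrative names of my own, not a real iPhone OS API, and the 20% threshold is just one point in the range suggested above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PowerState:
    wifi_on: bool = True
    use_3g: bool = True              # False means fall back to EDGE
    brightness_pct: int = 80
    push_enabled: bool = True
    fetch_interval_min: int = 15
    background_allowed: bool = True


def last_legs(normal, battery_pct, threshold_pct=20, fetch_multiplier=2):
    """Return the settings to apply, given the user's normal settings.

    Above the threshold the normal settings are returned untouched; at or
    below it, every large current draw is dialed back.
    """
    if battery_pct > threshold_pct:
        return normal
    return replace(
        normal,
        wifi_on=False,
        use_3g=False,
        brightness_pct=min(normal.brightness_pct, 20),
        push_enabled=False,
        fetch_interval_min=normal.fetch_interval_min * fetch_multiplier,
        background_allowed=False,
    )


print(last_legs(PowerState(), battery_pct=15))
```

Deriving the low-power settings from the normal ones each time, rather than mutating the current state, keeps the function safe to call repeatedly: the fetch interval is not doubled again on every battery update.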
Historically, Apple delivers products that have extremely polished, working features. Essentially, they err on the side of only releasing features that work, always work, and work well, instead of releasing features that don’t always work, or lack polish.
That said, a lot of the market has caught up since 2009. It’s time to address all of those gripes, and I’m hoping OS 4.0 fills some of the glaring holes in the feature set tomorrow. We’ll find out soon.
Jan 24th
Something that’s bugged me for a long time is how crude and arbitrary signal bars on mobile phones are. With a few limited exceptions, virtually every phone has the exact same design: four or five bars in ascending order by height, which correspond roughly to the perceived signal strength of the radio stack.
Or do they? Let me just start by saying this is an absolutely horrible way to present a quality metric, and I’m shocked that years later it is still essentially the de facto standard. Let me convince you.
Let’s start from the beginning. The signal bar analogy is a throwback to times when screens were expensive, physically small, monochromatic if not 8 shades of grey, and anything over 100×100 pixels was outrageously lavish. Displaying the actual RSSI (Received Signal Strength Indicator) number would’ve been difficult and confusing for consumers, varying between 8 shades of grey would have been hard to distinguish, and making one bar breathe in size could have sacrificed too much screen real estate.
It made sense in that context to abstract the signal quality visualization into something that was both simple, and readable. Thus, the “bars” metaphor was born.
Since then, there have been few if any deviations away from that design. In fact, the only major departure thus far has been Nokia, which has steadfastly adhered to a visualization that makes sense:
Namely, their display metaphor is vertically ascending bars that mirror call quality/strength. This makes sense, because it’s an optimal balance between screen use and communicating the quality in an easy to understand fashion. Moreover, they have 8 levels of signal, 0-7 bars showing. Nokia should be applauded for largely adhering to this vertical format. (In fact, you could argue that the reason nobody has adopted a similar metaphor is because Nokia has patented it, but I haven’t searched around)
It’s 2010, and the granularity of the quality metric on most phones is arbitrarily limited to 4 or 5 levels at best.
Thus, an optimal design balances understandability with level of detail. On one hand, you could arguably simply display the RSSI in dB, or on the other hand sacrifice all information reporting and simply report something boolean, “Can Call” Yes/No.
Personally, I’m waiting for something that either leverages color (by sweeping through a variety of colors corresponding to signal strength) or utilizes every pixel of length for displaying the signal strength in a much more analogue way.
Green and red are obvious choices for color, given their nearly universal meaning for OK and OH NOES, respectively. Something that literally takes advantage of every pixel by breathing around instead of arbitrarily limiting itself to just 4 or 5 levels also wouldn’t be hard to understand.
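Here is a minimal sketch of that kind of indicator: one continuous bar whose length and color both track the reading. The -113 and -51 endpoints are assumed worst and best values that I picked for illustration, and the function names and 40-pixel width are hypothetical.

```python
def signal_fraction(level_db, floor_db=-113.0, ceiling_db=-51.0):
    """Clamp a reading to the range 0.0 to 1.0 between assumed worst and best values."""
    span = ceiling_db - floor_db
    return max(0.0, min(1.0, (level_db - floor_db) / span))


def bar_pixels(level_db, width_px=40):
    """Length of a single continuous bar, using every pixel of the width."""
    return round(signal_fraction(level_db) * width_px)


def bar_color(level_db):
    """Blend from red (weak) to green (strong) as an (R, G, B) tuple."""
    f = signal_fraction(level_db)
    return (round(255 * (1 - f)), round(255 * f), 0)


for level in (-113, -82, -51):
    print(level, bar_pixels(level), bar_color(level))
```

A 40-pixel bar gives 41 distinguishable lengths instead of 4 or 5, and the color carries the same information for anyone who only glances at it.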
Fundamentally, however, the bars still have completely arbitrary meaning. What constitutes maximum “bars” on one network and device has a totally different meaning on another device or carrier. Even worse, comparing the same visual indicator across devices on the same network can often be misleading. For example, the past few months I’ve made a habit of switching between the actual RSSI and the resulting visualization, and I’ve noticed that the iPhone seems to have a very optimistic reporting algorithm. No doubt, this is due much in part to the systematically-poor perception of AT&T’s network quality.
There’s an important distinction to be made between the way signal is reported for WCDMA versus GSM as well:
First off one needs to understand that WCDMA (3G) is not the same thing as GSM (2G) and the bars or even the signal strength can not be compared in the same way, you are not comparing apples to apples. The RSCP values or the signal strength in WCDMA is not the most important value when dealing to the quality of the call from a radio point of view, it’s actually the signal quality (or the parameter Ec/No) that needs also to be taken into account. Source
That said, the cutoff for 4 bars on WCDMA seems to be relatively low, around -100 dB or lower. 3 bars seems around -103 dB, 2 bars around -107 dB, and 1 bar anything there and below. Even then, I’ve noticed that the iPhone seems to run a weighted average, preferring to gradually decrease the report instead of allowing for sharp declines, as is most usually the case.
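To see how smoothing changes what the bars show, here is a small sketch. The cutoffs are my reading of the rough numbers above (4 bars at about -100 dB and up, 3 down to -103, 2 down to -107, 1 below that), and the exponential moving average is a stand-in I chose for illustration; I don’t know what averaging the iPhone actually uses.

```python
def bars_from_db(level_db, cutoffs=(-100.0, -103.0, -107.0)):
    """Map a reading to 1-4 bars using the rough cutoffs observed above."""
    four, three, two = cutoffs
    if level_db >= four:
        return 4
    if level_db >= three:
        return 3
    if level_db >= two:
        return 2
    return 1


def smooth(readings, alpha=0.25):
    """Exponential moving average: each new reading only nudges the display."""
    smoothed = []
    current = None
    for value in readings:
        if current is None:
            current = value
        else:
            current = alpha * value + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


raw = [-95, -95, -110, -110, -110]
print([bars_from_db(v) for v in raw])
print([bars_from_db(v) for v in smooth(raw)])
```

With the raw readings the display drops straight from 4 bars to 1; with smoothing it steps down gradually over the next few samples, which matches the behavior described above.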
What you’re reading isn’t really dBm, dBmV, or anything really physical, but rather a quality metric that also happens to be reported in dB. For whatever reason, most people are averse to understanding dB, however, the most important thing to remember is that 3 dB corresponds to a factor of 2. Thus, a change of -3 dB means that your signal has halved in power/quality.
The notation dBm is referenced to 1 mW. Strictly speaking, to convert to dBm given a signal in mW:
P(dBm) = 10 · log10( P(mW) / 1 mW )
Likewise, to convert a signal from dBm back to mW:
P(mW) = 1 mW · 10^( P(dBm) / 10 )
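Both conversions follow directly from the definition of dBm as power referenced to 1 mW. A quick sketch, with function names of my own choosing:

```python
import math


def mw_to_dbm(power_mw):
    """Convert power in milliwatts to dBm (decibels relative to 1 mW)."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return 10 * math.log10(power_mw)


def dbm_to_mw(power_dbm):
    """Convert dBm back to milliwatts."""
    return 10 ** (power_dbm / 10)


print(mw_to_dbm(1.0))    # 1 mW is the 0 dBm reference
print(dbm_to_mw(-3.0))   # roughly half a milliwatt
```

The second line is the “3 dB is a factor of 2” rule in action: dropping 3 dB from the 1 mW reference leaves about half the power.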
But even directly considering the received power strength or the quality metric from SNR isn’t the full picture.
In fact, most of the time, complaints that center around iPhones failing to make calls properly stem from overloaded signaling channels used to setup calls, or situations where even though the phone is in a completely acceptable signal area, the node is too overloaded. So, as an end user, you’re left without the quality metrics you need to completely judge whether you should or should not be able to make a data/voice transaction. Thus, the signal quality metric isn’t entirely a function of client-tower proximity, but rather node congestion.
Carriers have a lot to gain from making sure their users are properly informed about network conditions; both so they can make educated decisions about what to expect in their locale, as well as to properly diagnose what’s going on when the worst happens. Worse, perhaps, carriers have even more to gain from misreporting or misrepresenting signal as being better than reality. Arguably, the cutoffs I’ve seen on my iPhone 3GS are overly optimistic and compressed into ~13 dB. From my perspective, as soon as you’re below about -105 dB, connection quality is going to suffer on WCDMA, however, that shows up as a misleading 3-4 bars.
What we need is simple:
Jan 20th
I noticed a strange but rather interesting problem the other night while working on one of my WVC54GCA cameras which was accidentally reset to defaults. But before I explain, I should back up a bit.
I mentioned in my previous post about ZoneMinder that I had not completely finished reviewing or evaluating the new V1.1 firmware that Linksys made available just recently. I now can, with a bit more certainty.
In the Linksys Release Notes, it states:
Version v1.1.00 build 02, Jun 15, 2008
- Support of Setup Wizard is temporarily disabled to address security issue
- Fix security issues
- Fix Camera stability issues
- Fix VLC multicast playback issues
- Update TZO DDNS client
- Change Firmware version format
- Enable HNAP protocol support
- Fix OCX stability issues
- Update valid value range of RTP Data Port. New range is even value of 1024~65514.
Version v1.00R24, Jan 7, 2008
- Updated TZO DDNS client code to resolve an issue with incorrect TZO server address being used to resolve the customers FQDN.
Now, this all sounds fine and dandy, but I think Linksys has fixed some issues and broken some others at the same time.
Concerned about the security issues (which are supposedly fixed in 1.1), I upgraded both cameras from my previous trusty (but somewhat unstable) V1.0 R24 build. At the time, everything worked. The cameras continued functioning over WiFi, all of my previous settings were preserved, everything seemed fine. However, I noticed that the wireless configuration page shows something a bit odd: “undefined.” Take a look:
Well, that’s odd. But WiFi worked. For the record, I’m using a Linksys WRT54G-TM running Tomato 1.27 with WPA/WPA2 AES and an alphanumeric key over 20 digits long.
The other night, my roommate inadvertently reset one of the cameras to default running the new V1.1 firmware. When I brought it back over ethernet and configured it to connect to WiFi, I noticed the configuration screen had changed, subtly:
The fields had changed. In addition to making the password field now the proper type (showing only bullets or whatever character your browser uses), there are two drop downs for which WPA version you’re using, as opposed to one. That seems fine, until after about a half hour of trying, I couldn’t make the 1.1 camera connect to my wireless. Arrgh.
Much tinkering, taking the thing apart, making sure the wireless card was seated, etc. later, I tried something else. I remembered how much of a hassle the previous release was and how flashing to some strange German version of the firmware and then back made wireless magically work.
Amazingly, Linksys doesn’t keep a nice repo around (or, at least I couldn’t find one, perhaps there’s an FTP dump somewhere). I found the old 1.00 R24 build here, and for backup’s sake, here’s a version I’ll host and keep around forever.
I flashed back to 1.00 R24, and immediately was able to connect. In addition, flashing back up to V1.1 after configuring wifi on 1.00 R24 makes everything work just fine. It’s a mess, but that’s the only way I can make it work.
I have a feeling that Linksys hasn’t completely nailed down the configuration settings/supplicant for that Ralink card they have in there, or that the open source drivers just aren’t that great. I strongly suspect this is partly the reason for the introduction of the WVC80N, although reviews do look promising. I’d like to get my hands on one of those!
Jan 16th
With the prevalence of eBook readers like the Nook, Kindle, Spring Design Alex and others, comes the necessity of building and maintaining a vast digital library. There are more resources online than one can easily list for both purchasing (and downloading) books in a suite of electronic formats, from PDF to DJVU, but what if you already own a book of the traditional dead-tree sort? What if you aren’t willing to purchase it again just for the convenience and ease of reading it on your brand new eBook reader?
Scanning becomes your only option.
I’ll be honest, the process isn’t easy, quick or glamorous. But it beats spending a day craning over your flatbed scanner or cutting the spine out of your expensive book to feed it through an equally expensive loose-leaf scanner (speaking of which, what the heck is up with how expensive they are?!). If the book is sufficiently expensive, it becomes an economical prospect quickly given the few hours required from start to finish.
I’m not going to address the legal/ethical/moral considerations. You could argue that making a PDF copy for yourself constitutes Fair Use, but the law being what it is, who the heck knows? Regardless, just exercise some moral introspection and decide for yourself.
The specific equipment I use is:
I’ve already mentioned Snapter twice, and although it’s commercial software (with a very generous 15 day free trial that gives you all the functionality of the real thing), don’t let that fool you. I’ve had a lot of success with it just because of how easy and functional it’s been in my experience. So much so that I went ahead and got the paid version.
That said, there are a few open source alternatives that do a pretty good job and are worth mentioning:
Scan Tailor is pretty good, has a nice GUI, and is very actively developed. Unpaper doesn’t have a GUI but offers a lot for a command line tool. Both OSS solutions also offer the advantage that you can either code changes yourself or propose them to the active developers.
Another relevant article with tips is from /. , which posted ironically the week after I had already embarked on and discovered the ins and outs of scanning with a digital camera myself.
My setup is simple: I mount the camera on the monopod, stick it on the table, and balance it there with my trusty CRC handbook and some other heavy books.
You might be wondering why I didn’t just use a tripod. The reason is that it’s much more challenging to tilt a tripod and balance it so the camera is completely perpendicular to the book’s surface. For the best photo quality, the book needs to be as close to parallel with the camera sensor as possible. It makes sense: otherwise we’ll have a more challenging time getting the book totally in focus (depth of field will come into play), and have a harder time flattening the book in software.
I generally tape the black paper down to the floor, snap photos of the cover and back cover, and then tape those down as well. More on positioning later.
The whole thing looks like the following:
I have the flash set to bounce from the ceiling, just because in practice this yields the most readable photos. I also use all the light I can from the room itself.
A difficult consideration is that sometimes the print/copy itself has glare. This seems a lot more common with newer books than older ones; it’s almost like the print has a layer of varnish atop it. Just make sure you preview a few images and can actually read the copy.
Positioning the book is the tricky part; it’s difficult to balance between filling the frame with the book (so you have good resolution), and leaving enough space at the edges so that your software can do edge detection. Leave too little space around, and you’ll have a nightmarish time trying to field flatten. Leave too much, and you’ll be throwing away a ton of your image. Even worse, if you don’t tape the book down, it will gradually creep out of the frame.
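The framing tradeoff above comes down to simple arithmetic: the more of the frame the book fills, the more pixels land on each inch of paper. Here is a minimal sketch of that calculation. The function name and the example numbers (a 4000-pixel-wide photo, a 12-inch spread) are hypothetical and not taken from my setup.

```python
def effective_dpi(image_width_px, fill_fraction, spread_width_in):
    """Approximate pixels per inch that land on the book itself.

    image_width_px:  horizontal pixel count of the photo
    fill_fraction:   share of the frame width the open book occupies (0-1]
    spread_width_in: physical width of the open two-page spread, in inches
    """
    if not 0 < fill_fraction <= 1:
        raise ValueError("fill_fraction must be in (0, 1]")
    return image_width_px * fill_fraction / spread_width_in


# Hypothetical example: the same photo framed tightly and loosely.
tight = effective_dpi(4000, 0.90, 12.0)  # book fills 90% of the frame width
loose = effective_dpi(4000, 0.60, 12.0)  # book fills only 60%
print(f"tight framing: {tight:.0f} DPI, loose framing: {loose:.0f} DPI")
```

Whatever margin the edge detection needs comes straight out of this number, so it is worth checking before shooting a few hundred pages.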
Another big consideration is rotation. I’ve discovered that Snapter doesn’t really account that well for material that has even subtle rotation. You end up with slight skew in the resulting images. A little skew isn’t a big problem, but more pronounced rotation will immediately cause you headaches.
I usually go for something like this:
You could zoom in a bit more in this case if you wanted; in practice you’ll discover for yourself what works best.
I set the camera to use a relatively big F/# (in this case F/5.6) so there’s as much depth of field as possible. You want the whole book in focus.
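To get a feel for how much the aperture buys you, the standard thin-lens depth of field approximation can be sketched as below. The focal length, subject distance, and circle of confusion (0.02 mm, a common figure for APS-C sensors) are assumptions for illustration and are not measurements from my camera.

```python
import math


def hyperfocal_mm(focal_mm, f_number, coc_mm):
    """Hyperfocal distance H = f^2 / (N * c) + f, all in millimetres."""
    return focal_mm ** 2 / (f_number * coc_mm) + focal_mm


def depth_of_field_mm(focal_mm, f_number, subject_mm, coc_mm=0.02):
    """Return (near_limit, far_limit) of acceptable focus in millimetres."""
    h = hyperfocal_mm(focal_mm, f_number, coc_mm)
    near = subject_mm * (h - focal_mm) / (h + subject_mm - 2 * focal_mm)
    if subject_mm >= h:
        # Focused at or beyond the hyperfocal distance: far limit is infinity.
        return near, math.inf
    far = subject_mm * (h - focal_mm) / (h - subject_mm)
    return near, far


# Hypothetical setup: 50 mm lens, book 1 m from the camera.
for n in (2.8, 5.6):
    near, far = depth_of_field_mm(50, n, 1000)
    print(f"f/{n}: in focus from {near:.0f} mm to {far:.0f} mm")
```

Stopping down from f/2.8 to f/5.6 roughly doubles the zone of acceptable focus in this example, which is what lets a curved page stay sharp from gutter to edge.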
This is the grueling part: capturing images of every page. Snag a friend if you can, as having two people makes this process go much faster. One can turn the page and crease stubborn ones into place, and the other can trigger the shutter with the remote and make sure the book isn’t creeping out of the frame.
I find this can take anywhere from a half hour to much longer, depending on how much trouble the book gives you. The most challenging parts are the very beginning and the end. At these points, the pages have the most curve to them, sometimes sticking up. This is where creasing them down or using some tape on the stubborn ones can make or break your day.
Eventually, you’ll have a directory full of images that need to be processed.
At this point, you can use whatever tool suits your fancy, but if you’re using Snapter, read on.
Click Book, grab all your photos, and go make yourself a drink as you wait for it to do initial edge detection and processing on images. Nothing is being changed, it’s just generating the initial traces around the book it finds.
After this is done comes the only other bothersome part. It’s very worthwhile to manually go through each page and make sure you’re happy with the edge detection. Frequently, pages that have black or dark color at the edge cause headaches. Drag the handles around until they match closer. This can be grueling, but it’s important.
Click Input, change the background color to black (since we’re using a black piece of paper, or at least I did). Under Output, I also generally turn cropping each page off since I’d rather deal with a spread. Grayscale output will save on space later, and I keep the DPI the same since I’ll compress and downsample later in Acrobat. Now, you can click process and have yourself another drink.
After this is done, you can preview the results on the right. If everything is right, click Save and wait a little longer.
Now you should have a directory full of images waiting to be made into a PDF.
You can use whatever you’d like to make the PDF from the resulting JPEGs; I’ve had luck just using Acrobat.
Click Create -> Merge Files into a Single PDF, and then grab all those images you have.
Combine them, and you should now have a huge PDF. Save it, but you aren’t done yet. At this point, I generally take a look at the PDF Optimizer under the Advanced tab, and click Audit Space Usage. Yeah, it should be pretty huge.
If you absolutely need color, just skip this. If your book is black and white, converting is going to save you a ton of space.
To convert pages to grayscale, under Advanced click Print Production -> Convert Colors. Check “Convert Colors to Output Intent” and select “Gray Gamma 1.8.” I usually then exclude the front and back covers from the page range, unless you don’t care about that pretty color you’ll be missing out on.
This process will also take some time. Acrobat is multithreaded, but still doesn’t use all 8 logical cores on my i7 920. Just be patient.
After this finishes, you should now see a dramatic difference under the space audit report for Images. There might be a lot of document overhead, however. Don’t worry, this is normal.
At this point, it usually makes the most sense to do some OCR if you want, just to make the document searchable. Document -> OCR Text Recognition -> Recognize Text Using OCR does the trick.
Click Edit and select Searchable Image (Exact). This won’t resize your images or do compression; we’ll do that later. Now, wait a long time while it consumes CPU cycles and hopefully makes your document so much more powerful and useful.
After this finishes, you’re ready to do some compression and hopefully make your document small enough to not be an embarrassment, you storage hog, you. I usually downsample to around 300 DPI, leave monochromatic images alone (since we don’t have any), and opt for JPEG2000. Check everything in the Discard Objects, Discard User Data, and Clean Up tabs.
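As a rough guide to what downsampling alone buys you (before JPEG2000 compression does its part), the pixel count scales with the square of the DPI ratio. This is a small sketch of that arithmetic; the 600 DPI source figure is a hypothetical example.

```python
def pixel_ratio(source_dpi, target_dpi):
    """Fraction of the original pixels that remain after downsampling.

    Images already at or below the target resolution are left alone.
    """
    if target_dpi >= source_dpi:
        return 1.0
    return (target_dpi / source_dpi) ** 2


# Hypothetical example: pages captured at an effective 600 DPI.
ratio = pixel_ratio(600, 300)
print(f"Downsampling 600 -> 300 DPI keeps {ratio:.0%} of the pixels")
```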
Click Ok, and now be prepared to wait the longest you have yet. Even on my rig, this takes an hour or two.
Check the space audit once more, and you should now have a reasonably sized, fully searchable, readable PDF, ready for your enjoyment.
Jan 10th
In recent months, home security and monitoring have become a matter of increasing concern across the country. Whether the reason is a local spike in crime or just peace of mind, the price and difficulty of setting up an enterprise-level security system at home are lower than ever.
That said, the variety of hardware, open and closed source monitoring software, and configuration options makes it a bit daunting to jump right into. I’ve worked and experimented with a number of configurations and finally settled on one that I think works best (at least for my needs).
I originally started out with just one Linksys WVC54GCA. It’s a 640×480, wired/wireless 802.11b/g network camera with a built in web server for stills and video, and some simple motion detection and alert functionality. The reason for choosing it was simple: price. It’s Linksys’ primary network camera offering, and as of this writing you can find it for $89 at newegg. In addition, there’s a newer camera with 802.11n, the WVC80N.
However, it isn’t perfect. To quote the cons of my newegg review:
Cons: Wireless range isn’t excellent; I have a very powerful wireless AP with a 12 dBi omnidirectional antenna and a 6 dBi directional antenna, and I had to reposition it so the camera could send video back at a decent bitrate (around 2 megabits is where it sits).
An important thing to note is that the latest .24 firmware breaks WPA/WPA2 support. Mine shipped with .24 and I had to downgrade back to .22 for it to work. A bit disappointing, but hopefully future firmware will fix this glaring problem. The linksys forums have the link to a custom built .22 (oddly enough with german language selected by default, but don’t worry, all the menus are still english).
Motion detection isn’t perfect, sometimes false positives will get annoying. I have sensitivity set all the way down and still get a few random videos of nothing going on.
More recently, I discovered that the software (despite being open source and a *nix derivative) locks up after anywhere between 6-24 hours when the camera is connected wirelessly. This is fixable (in a haphazard sort of way) by calling an internal page that reboots the camera every 3 hours through a schedule in my Tomato router:
Thus far, this has proven a robust fix and makes the cameras entirely usable. I’ve notified Linksys and even had a chat online with a higher level tech that passed my findings on to a firmware engineer. They’ve recently released an update which purports to fix stability issues:
Version v1.1.00 build 02, Jun 15, 2008
- Support of Setup Wizard is temporarily disabled to address security issue
- Fix security issues
- Fix Camera stability issues
I have yet to fully test it. As an aside, the cameras are actually embedded x86 inside, sporting an AMD Geode SC1100 processor, 32? MB of SDRAM (2x TSOP marked PSC A2V28S40CTP), and Ralink 802.11b/g/a(?) chipset (RT2561T) as pictured.
Image quality is a little above average but nothing wonderful due to the relatively tiny plastic fixed focus lens system. Low light sensitivity is ok, but nothing stellar; you still need moonlight or ambient street lighting to get usable results at night. If you don’t mind those caveats, you’ve basically got the beginnings of a very robust (and cheap) network camera.
There are a number of relevant pages that are undocumented on the camera itself:
Reboot: http://USER:PASS@ADDRESS/adm/reboot.cgi
MJPEG stream: http://USER:PASS@ADDRESS/img/mjpeg.jpg
JPEG still: http://USER:PASS@ADDRESS/img/snapshot.cgi?size=3 (3- 640×480, 2- 320×240, 1- 160×120)
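The pages above are easy to call from a script instead of a browser. The following is a minimal sketch using only the Python standard library. It assumes the camera accepts HTTP Basic authentication (which the USER:PASS@ADDRESS form implies), and the helper names, the address, and the credentials are made up for illustration.

```python
import base64
import urllib.request

# Snapshot size codes and resolutions, as listed above.
SNAPSHOT_SIZES = {3: (640, 480), 2: (320, 240), 1: (160, 120)}


def authed_request(address, page, user, password):
    """Build (but do not send) a request carrying HTTP Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(f"http://{address}{page}")
    req.add_header("Authorization", f"Basic {token}")
    return req


def reboot_request(address, user, password):
    """Request for the undocumented reboot page."""
    return authed_request(address, "/adm/reboot.cgi", user, password)


def snapshot_request(address, user, password, size=3):
    """Request for a JPEG still at one of the three size codes."""
    if size not in SNAPSHOT_SIZES:
        raise ValueError(f"size must be one of {sorted(SNAPSHOT_SIZES)}")
    return authed_request(address, f"/img/snapshot.cgi?size={size}", user, password)


# Hypothetical usage; uncomment with a real address and credentials.
# with urllib.request.urlopen(snapshot_request("192.168.1.50", "admin", "secret"), timeout=10) as resp:
#     open("still.jpg", "wb").write(resp.read())
# urllib.request.urlopen(reboot_request("192.168.1.50", "admin", "secret"), timeout=10)
```

Run from cron or a router's scheduler every few hours, the reboot call reproduces the workaround described earlier for the wireless lockups.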
The options offered in the camera’s internal setup pages aren’t very robust, but offer just enough for you to do almost everything you’d want to.
After acquiring another Linksys camera for myself (and another 3 for the parents), trudging my way through the reboot issue, and reasonable but not stellar image quality, I decided I was ready for something more. Axis seems to have very good support, choice, and performance, in addition to heaps more customization and options for the camera itself. The catch? Price.
I decided to start off with Axis’ cheapest offering, the 207-series of network cameras. I managed to snag an Axis 207MW that had been used just once at a trade show that was being sold as used on eBay, and my dad went ahead and just purchased outright a 207W from Newegg. The distinction between the 207W and MW is that the 207MW has a 1.3 megapixel camera supporting resolutions of up to 1280×1024, whereas the 207W is just 640×480. They’re both 802.11b/g so you can move them throughout the house, and have almost identical setup and configuration pages. Of course, like the Linksys WVC54GCA, there’s optional ethernet support as well. Virtually all the other features are the same between the 207W and 207MW.
As of this writing, the 207MW is $328 at Amazon, and the 207W is $286 at Amazon.
Right off the bat, you can tell this camera is much different. It’s got an actual glass lens system, focusing ring, and a compact form factor with a longer cable. In addition, the antenna is external and swivels and snaps out so you can position it however suits getting the best signal. There are a variety of status LEDs on the back that make troubleshooting limited wireless connectivity simpler. The front clear ring is actually a large lightguide for four LEDs that can be either green or amber depending on the status of the camera. These can be disabled as well.
Image quality is also much better on the Axis 207MW than the Linksys. Originally, I had a WVC54GCA mounted where the Axis is now inside the garage. The Axis is both much more stable, and also gets better wireless reception outside in an otherwise difficult to reach trouble spot.
Among other things, the Axis offers many more configuration options within its internal administrative pages, as well as (if you’re interested in running it) many more options for built in motion detection. One of the more important things I’ve come across is the ability to change exposure prioritization so the otherwise very well lit driveway doesn’t come out a homogeneous white from pixels saturating as often. This kind of exposure prioritization can be done on the Axis, but not on the Linksys as shown:
There are just a wealth of options that really make the Axis shine over the cheaper Linksys if you delve deeper. I could write pages about the differences that the extra nearly $200 makes (if you can afford it). Both cameras offer the ability to upload images and 5-10 second video clips of motion detection events to an FTP share, or attach them to an email. Detailing the differences between the two (and the ultimate shortcomings of both) is another article in and of itself.
At the end of the day, I found motion detection somewhat unreliable on both the Axis and Linksys; either I wound up with far too many motion event video clips or nearly nothing. Even worse, downloading and then watching hundreds if not thousands of false positives is a grueling task. If you’re a basic user or just interested in having a camera for temporary purposes while you’re away on a trip, perhaps just the in-camera features are enough. However, if you’re looking for something more robust for a number of permanent cameras with much better motion detection, keep reading. I now use both types of camera just as inputs for ZoneMinder, as you’ll see later on.
ZoneMinder is a GPL’d, LAMP-based web tool for managing and monitoring virtually every kind of possible video source. Its supported sources span everything from cheap USB Logitech webcams, to network security cameras with built in webservers (like the two I’ve covered), to traditional video sources through a video capture card. Their documentation is a bit overly complicated (you can get to their supported hardware list here), and at the end of the day you’re going to need either local linux driver support (and a device path to the video like you’d expect for a webcam/TV tuner), or a network path to a JPEG, MJPEG, or MPEG4 stream.
The aim of ZoneMinder is to do all motion detection, video archiving, and image processing in one centralized place; simplifying use and making it easier to keep track of new events as they happen. Of course, the only downside to this is that all that motion detection and video capture requires a relatively powerful computer. Official documentation claims that even an ancient Pentium II should be able to do motion detection and capture for one camera at 25 FPS.
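To make the idea of centralized motion detection concrete, here is a toy frame-differencing detector. It is an illustration of the general technique only, not ZoneMinder's actual algorithm, and both thresholds are arbitrary values picked for the example.

```python
def changed_fraction(prev, curr, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than the threshold.

    Frames are equal-sized lists of rows of grayscale values (0-255).
    """
    total = changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def is_motion(prev, curr, pixel_threshold=25, area_threshold=0.02):
    """Flag motion when enough of the frame has changed."""
    return changed_fraction(prev, curr, pixel_threshold) >= area_threshold


# Tiny synthetic frames: a flat gray scene, sensor noise, and a bright object.
still = [[100] * 10 for _ in range(10)]
noisy = [[105] * 10 for _ in range(10)]
moved = [row[:] for row in still]
for y, x in ((4, 4), (4, 5), (5, 4), (5, 5)):
    moved[y][x] = 200

print(is_motion(still, noisy))  # small uniform change, below the pixel threshold
print(is_motion(still, moved))  # 4% of the pixels changed sharply
```

Every frame from every camera has to pass through a comparison like this, which is why the CPU cost grows with frame rate, resolution, and camera count.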
On the old computer I’ve configured (with a Pentium 4 Northwood 2.8 GHz and 2 GB of RAM), I’ve found that adding an 8 FPS VGA network camera and doing motion detection and capture adds between 10-20% CPU load.
Luckily ZoneMinder is relatively easy to set up if you’ve ever been near a modern linux distro with aptitude. As I noted earlier, ZoneMinder should ideally be run on a LAMP or similar web server; however, they claim that distro, web server, and SQL database support is actually quite diverse. I performed my installation on a fresh install of Ubuntu 9.10 Karmic by following instructions similar to Linux * Screw’s:
sudo apt-get install zoneminder apache2 php5-mysql libapache2-mod-php5 mysql-server ffmpeg
Once that was finished, the following:
sudo ln -s /etc/zm/apache.conf /etc/apache2/conf.d/zoneminder.conf
sudo /etc/init.d/apache2 force-reload
http://_YOUR WEBSERVER ADDRESS_/zm/
At this point, I’d encourage you to enable user authentication in Options -> System -> ZM_OPT_USE_AUTH. Ticking this box and saving will enable another tab, Users. I generally configure one admin for making changes and a less privileged “User” account for simply viewing the cameras and motion detection events as shown:
The remainder of options are largely fine in their defaults; the only major thing that you should be concerned with are the paths if you care about certain disks being used. I’ve recorded almost 800 events so far in VGA resolution and have used up an additional 1% of the meager 80 GB HDD on the system.
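Those two numbers are enough for a back-of-the-envelope storage estimate. The sketch below treats "almost 800 events" as 800 and "1%" as exact, so the result is only a rough figure, and the 50% free-space example is hypothetical.

```python
def avg_event_mb(events, disk_gb, percent_used):
    """Average size of one recorded event in MB (decimal units)."""
    return disk_gb * 1000 * (percent_used / 100) / events


def events_that_fit(disk_gb, free_percent, mb_per_event):
    """How many more events fit in the given share of the disk."""
    return int(disk_gb * 1000 * (free_percent / 100) / mb_per_event)


per_event = avg_event_mb(events=800, disk_gb=80, percent_used=1)
print(f"about {per_event:.1f} MB per VGA event")

# Hypothetical: half of the 80 GB disk set aside for events.
print(events_that_fit(disk_gb=80, free_percent=50, mb_per_event=1.0), "events")
```

At roughly a megabyte per event, even a small old disk holds tens of thousands of VGA events before anything needs to be purged.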
Now, to add some video sources. This is where you really have to know the path to either JPEGs or an MJPEG stream on the camera.
Click “Add New Monitor” in the bottom right. Now, if you have an Axis or Panasonic network camera (arguably the two de-facto industry standards) there are some presets that are worth checking out that you can use by clicking “Presets” in the top right. Change the source type to Remote (if it isn’t already from using a preset), and name it appropriately. I also usually set the FPS to either 5 or 8; from what I’ve seen, higher really isn’t feasible.
Now click the “Source” tab. This is where if you used a preset, your life is much easier, as the remote host path is already filled in. If not, it’s still simple, you just have to know how to get the relevant data from your source. In my example, I have the following:
From the above screenshot:
Remote Host Name: user:pass@_PATH TO YOUR CAMERA_
Remote Host Port: 80 (Or something else, if you’ve changed it)
Remote Host Path: /axis-cgi/jpg/image.cgi?resolution=640x480
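The three fields above combine into a single URL, which is handy for checking in a browser before handing it to ZoneMinder. Here is a small sketch of that composition. The helper names and the example host are hypothetical, and note that the resolution is written with a plain ASCII "x" between width and height.

```python
def axis_jpeg_path(width, height):
    """Remote Host Path for a still JPEG at the requested resolution."""
    return f"/axis-cgi/jpg/image.cgi?resolution={width}x{height}"


def source_url(host, port, path, user=None, password=None):
    """Join the host, port, and path fields into one URL."""
    auth = f"{user}:{password}@" if user else ""
    return f"http://{auth}{host}:{port}{path}"


# Hypothetical camera address and credentials.
print(source_url("192.168.1.60", 80, axis_jpeg_path(640, 480), "user", "pass"))
```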
Note that this is for the Axis 207 series cameras, although in general all the Axis cameras follow the same syntax (nice, isn’t it?). You can’t use whatever resolution you want, however, all the major and obvious choices work. You’ll notice I’ve just used VGA instead of the full 1.3 MP 1280×1024 resolution image from the 207MW. This is because using full resolution does seem to generate too much network traffic for my 802.11g network (despite my best efforts, the garage remains a dead zone thanks to chicken wire stucco construction), and FPS takes a large hit. No doubt that if the connection were wired, higher would be feasible. However, VGA is more than enough for now.
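The resolution versus frame rate tradeoff is easy to estimate: the achievable frame rate is the usable link throughput divided by the size of one JPEG frame. The frame sizes below are guesses for illustration (actual sizes depend on the scene and compression setting), and the 2 megabit figure is the throughput quoted earlier for the Linksys, used here only as a stand-in for a weak wireless link.

```python
def max_fps(link_bits_per_s, frame_bytes):
    """Upper bound on frames per second over a link of the given throughput."""
    return link_bits_per_s / (frame_bytes * 8)


LINK = 2_000_000           # ~2 Mbit/s of usable throughput (assumed)
VGA_FRAME = 40_000         # hypothetical 640x480 JPEG, about 40 KB
FULL_RES_FRAME = 150_000   # hypothetical 1280x1024 JPEG, about 150 KB

print(f"VGA:      {max_fps(LINK, VGA_FRAME):.1f} FPS at most")
print(f"Full res: {max_fps(LINK, FULL_RES_FRAME):.1f} FPS at most")
```

Under these assumptions the full resolution stream cannot even reach 2 FPS, which matches the kind of hit I saw in practice.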
Adding the Linksys sources is just as easy given the paths I outlined previously. I haven’t added any internal sources personally; however, I imagine that configuration is the same if not easier. It requires knowing the path and setting a few additional constraints so you don’t overload your server.
Now that you’ve added sources, you can configure their function.
You would think these options would be intuitive; however, they caused me a bit of confusion due to their shortened names. They are as follows, from the ZM documentation:
In practice, my cameras are generally set to Modect, unless I have an indoor camera with particularly high traffic, in which case Monitor makes more sense since all the motion detection events would be me moving around and about (take it from experience, you see yourself doing some pretty strange things). This is also a nice way of judging how much load each camera adds, as the setting is pretty immediate if you’re watching htop.
With time, you should now have a zoneminder console similar to mine:
I’ve noted in particular how offline hosts appear red, while online hosts appear green or orange depending on their function.
The last area of configuration is the zones themselves (finally, the zone in ZoneMinder!). These define the regions of interest in each video source that will be used for event detection. Clicking on “1” under Zones will allow you to modify the default zone for the video source. This is where you’re really given a lot of control, far beyond anything in-camera will ever offer, even from Axis. You can add points and create polygons, as well as tweak sensitivity on an interface that looks like this:
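A polygonal zone boils down to one question per pixel: is this point inside the shape or not? The classic ray-casting test below shows the idea. It is a generic sketch, not ZoneMinder's own code, and the trapezoid coordinates are an invented "driveway" on a 640x480 frame.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses.

    polygon is a list of (x, y) vertices in order; an odd count means inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical zone: a trapezoid covering a driveway.
driveway = [(100, 400), (540, 400), (400, 200), (240, 200)]
print(point_in_polygon(320, 300, driveway))  # centre of the driveway
print(point_in_polygon(50, 300, driveway))   # off to the left, e.g. a waving tree
```

Changes outside the polygon are simply ignored, which is how a zone can cut out a busy street or a tree that would otherwise trigger events all day.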
Returning to the main “Console” view, the rest of the interface itself is relatively self explanatory.
If you’re interested in viewing all of the video sources currently enabled, clicking “Montage” should give a view similar to the following:
You’re also given the FPS below each camera, which is handy. Lower resolution cameras (e.g. the 320×240 source I show blacked out) are somewhat intelligently tiled as well, using the available space pretty well.
Clicking on any of the events back on the console page should bring up something similar to the following, showing details about all the motion detection events from the given source (or all sources):
Clicking on any of the event IDs or names pulls up a window where you can review the event, from a few seconds before, to just after. You can also click anywhere below on the scrub bar to jump ahead or back, as expected. It isn’t perfect, but does a surprisingly good job:
Perhaps the coolest is “Timeline” view, a high-level plot of motion detection activity across all cameras overlaid on a timeline. This gives you an at-a-glance overview of whether the same events were being detected across all cameras at the same time, and lets you quickly pick out what time of day generates the most activity. In this view, mousing over times and activity (marked in red on the plot) refreshes the thumbnail to match, with the detection region highlighted in red.
It isn’t always the most useful way to review events, but perhaps one of the more unique. I’ve found it useful for reviewing a few days or weeks at a glance when I’m gone. There’s also certainly a nice pattern that emerges over time, at least for me.
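The pattern the timeline reveals is essentially a histogram of events by time of day. If you ever export event timestamps, the same summary takes a few lines. The event list here is invented sample data, and nothing about how ZoneMinder stores or exports events is assumed.

```python
from collections import Counter
from datetime import datetime


def events_per_hour(events):
    """Count events per (camera, hour of day) from (camera, timestamp) pairs."""
    return Counter((camera, ts.hour) for camera, ts in events)


def busiest_hour(events):
    """Hour of day with the most events across all cameras, or None if empty."""
    hours = Counter(ts.hour for _, ts in events)
    return hours.most_common(1)[0][0] if hours else None


# Invented sample data for two cameras.
sample = [
    ("garage", datetime(2010, 1, 9, 8, 5)),
    ("garage", datetime(2010, 1, 9, 8, 40)),
    ("driveway", datetime(2010, 1, 9, 8, 41)),
    ("driveway", datetime(2010, 1, 9, 17, 15)),
]
counts = events_per_hour(sample)
print(counts[("garage", 8)], "garage events in the 8 o'clock hour")
print("busiest hour:", busiest_hour(sample))
```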
ZoneMinder supposedly has a nice mobile view available, however, I’ve had relatively little experience with it and had difficulty enabling it on my iPhone 3GS. Viewing the normal ZM site works fine, however, motion detection playback doesn’t work all the time.
In the meantime, I continue to use IP Vision for monitoring all my MJPEG sources:
If you’re interested in me detailing this, just let me know. Setup is again straightforward and merely involves knowing the correct paths and forwarding a few ports in your router. Also, it’s a great way to quickly consume tons of 3G bandwidth!
Setting up a robust, nearly commercial-level reliability home video surveillance system is now easier than ever thanks to the huge variety of video hardware and open source software available. I’ve moved from one single camera with in-camera motion detection sending alert emails with 5 second video clips to a gmail account (which quickly filled to the limit), to a secure and expandable motion detection suite monitoring at times 7 cameras that is accessible virtually anywhere I can get online.
If you’re only interested in home monitoring during a vacation or time away, setting up a system like this might not be the best solution, so long as you’re willing to sift through either an FTP dump full of videos and stills or a gmail account chock-full of videos. However, if you’re serious about having a manageable system with a number of fixed (or PTZ!) cameras that you need constantly monitored, ZoneMinder makes sense and gets the job done. In the future, for serious users, it can even be hosted commercially, or the event cache can simply be stored on a network share elsewhere to prevent physical tampering or theft.
Jan 7th
Like many others yesterday, I eagerly awaited the Microsoft CES keynote and the chance to see Steve Ballmer once again have a Developers Developers Developers moment on stage. Although it was initially marred by a power outage which delayed the conference some 20 minutes and damaged a Media Center TV and an ASUS eeeTV demo, what really made me pull the plug was what Microsoft did to the live stream itself.
Initially it was plagued with audio problems. The stream started too quiet, then suddenly lost the left channel, then the left channel came back but killed the right channel. At one point I’m certain there was some sort of loop in a volume normalization system, as gain increased continually for at least an entire minute. Of course, these issues are technical and completely understandable given the fact that nearly everything needed to be restarted after the power outage.
So imagine my disgust, and the disgust of others, when during the Microsoft Xbox 360 part of the keynote, the following comes up right as they prepare to show the Halo Reach trailer:
Absolutely incredible, censoring a live keynote because of IP concerns from the very company throwing the keynote. Even better, apparently the Xbox team wasn’t made aware that there was any problem at all with what was going to be shown:
Sorry that had to black that out….I did not know
t -Major Nelson
Even more strange, the content that was shown wasn’t new, in spite of the fact that the announcer lead-up to the video made it sound like it was going to be. It was nothing more than the Halo Reach trailer released over a month ago.
It’s a video…not a #haloreach demo. -Major Nelson
Why then did this content merit censoring the live stream for nearly 3 minutes? Is Microsoft not comfortable with using the public spectacle and attention that is CES to promote its own products and games? Is it honestly concerned that showing a trailer for a game in a live video stream constitutes some sort of breach in IP? What?
That by itself wouldn’t be noteworthy; it was what followed that really iced the proverbial cake for the keynote.
Yes. They did it again. If you’re so inclined, the video is here for everyone to view, now that we’ve all been made to feel like children.
There is seriously so much wrong with doing something like this to the thousands of people watching the live stream that aren’t at CES but are still interested, that I don’t even know where to begin. In fact, I don’t even have to, because so much of that is obvious. But not, apparently, to Microsoft. Shortly after was when I stopped watching.
Nice of Microsoft to leave end-user-facing employees that work and try hard like Major Nelson to pick up all the pieces:
Reagarding[sic] the Reach blackout on the stream…..I am going to talk to some folks about that #notcool -Major Nelson
Ok, I need to take a walk and have a little chat with some folks. -Major Nelson
Imagine how shocked I was today, when during Paul Otellini’s Intel CES keynote the following popped up on the livecast:
I’m still not entirely certain whether, once again, the stream had been interrupted due to intellectual property concerns, DRM, or simply because they didn’t want to show more 3D parallax (despite having done so just minutes before).
Whatever the case, this seriously needs to stop.