Wednesday, February 28, 2007

Rebuilding Chef

Chef is old and needs a rebuild, and its RAID volume has issues. I chose to rebuild Chef with Debian stable. Chef is a server, and as such I don't want many updates for it. Since it's no longer doubling as my firewall, I'm more comfortable with this as well. I trust Linux software RAID more than RAIDframe.

tasks:
postfix - DONE
spamassassin (spamd.conf) - DONE, looks good
apache2 (tested, data transferred) - DONE
dovecot (installed, configured) - DONE
apcupsd (not started)
mrtg (installed, unconfigured)
nagios (not started)
snmpd (installed, configured, running) - DONE
smbd/nmbd - DONE
nfs exports - DONE
raid (one half of mirror installed until I migrate mythtv to 500GB)
webalizer (not started)
sensors (started, missing drivers)

Saturday, February 24, 2007

Bad jpegs

In the last few months, when we browse photos on babybaer, which mounts a volume from chef over NFS, we occasionally get odd effects that look like errors in the jpeg encoding: as if a small piece of data is missing, resulting in a color offset in part of the image and a slightly shifted image after the error. Like this:



chef has been logging an error "nfsd send error 55" for the longest time, with no apparent ill effects. Is this maybe related and we only noticed it now? Are things getting worse?

A Google search revealed no answers as to where the error on chef is coming from in the first place. It means out of buffer space, but I couldn't find any hints *why* that might be happening.
All I could find are complaints from other people who run Linux clients against OpenBSD servers, about that error filling up the logs. It also appears to happen on NetBSD, which would make sense, given how much code is shared between NetBSD and OpenBSD. Increasing NMBCLUSTERS might help a bit, but it's no definitive solution. Turning off CBQ might help, but I don't run CBQ (or any firewalling) on chef anymore. Someone suggested a bad interaction with cheapo RealTek cards (hey, look at that, chef is using a RealTek card). I switched to another interface (dc0), let's see if it makes a difference. Of course, I have no way of knowing for sure any time soon, since the error happens only when browsing hundreds of jpegs over NFS.

Update:
I took a look at the timestamps of the nfsd error 55 messages and noticed they don't line up with when we see bad images. Today there was only one error logged this morning, but Patricia got a couple bad images this afternoon. Lovely. This doesn't help.

Update:
At this time I'm not even sure how to quantify the problem. Let's see if this is an issue with Linux NFS to OpenBSD, or something else. Let's generate an md5sum list of a large body of data (my jpegs) straight from disk on chef and via NFS on babybaer.

Update:
This is really easy in Linux:
find . | grep -i jpg$ | xargs -iD md5sum "D" > md5-linux

And really weird in OpenBSD (mostly because the output of md5 is so convoluted):
find . | grep -i jpg$ | xargs -iD md5 "D" | awk -F '[)(=]' '{print $4 $3 $2}' > md5-openbsd

awk seems to be doing some fancy buffering (?) when piping output to another program (like sed) or just a file. I just don't get any output whatsoever. Hmmm.
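
For comparing the two lists, here is a minimal sketch that avoids the awk field juggling. It assumes the BSD md5 output looks like "MD5 (path) = hash" and that md5sum prints "hash  path"; the function names bsd_to_gnu_md5 and changed_files are made up for illustration.

```shell
#!/bin/bash

# Convert BSD-style "MD5 (path) = hash" lines on stdin into the
# "hash  path" layout that md5sum prints.
bsd_to_gnu_md5() {
    sed -E 's/^MD5 \((.*)\) = ([0-9a-f]{32})$/\2  \1/'
}

# Print the paths whose checksum differs between two "hash  path" lists.
# The hash occupies columns 1-32 and the path starts at column 35.
# Paths that appear in only one of the lists are ignored.
changed_files() {
    awk '
        NR == FNR { sum[substr($0, 35)] = substr($0, 1, 32); next }
        {
            path = substr($0, 35)
            if (path in sum && sum[path] != substr($0, 1, 32)) print path
        }
    ' "$1" "$2"
}

# Possible usage, with the file names from above:
#   find . | grep -i 'jpg$' | xargs -iD md5 "D" | bsd_to_gnu_md5 > md5-openbsd
#   changed_files md5-linux md5-openbsd
```

Matching on the path rather than on line order means the two find runs don't have to list the files in the same order.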

...

oh great, while experimenting with that, chef decided to say Goodbye and hung itself. No reaction on console. Just hangs. That sucks.

Ctrl-Alt-Esc doesn't work. But Shift-PgUp/PgDn does. Surprise. The box is not on the network, not echoing characters. *sigh* Looks like another power cycle. Bah.

Update:
Rebooted chef last night. I reproduced the jpeg problem again this afternoon on two images, so it's not a RealTek NIC driver issue. There are no logs on chef indicating an error, nor on babybaer. It looks like random data corruption, but that doesn't make sense. ... Wait. There is a SMART raw read error on chef, followed by ecc_hardware corrected. Could this be related? Rebooting babybaer to rule out the impact of the local file system cache.

Update:
Nope. The root of the problem is on chef. I reproduced bad jpeg copies by simply cp'ing a directory of jpegs and running md5sum on the files. One recent directory (02-25) seems to be particularly susceptible to this problem. Of the 8 files I copied in several attempts, up to 7 had different md5sums. I hexdumped a few files and the differences are always on 4 kByte page boundaries (4 kByte, 12 kByte).
That seems to indicate either a file system problem (but the filesystem checks clean), or a RAIDframe volume problem, maybe due to issues with the underlying disk sectors (possible), a memory problem (but then I would expect to see more random failures), or a CPU problem (ditto).
I'm still uncomfortable with the raw read error/ecc corrected counters reported by smartd.
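
A quicker way to check the page-boundary observation than reading hexdumps could look like the sketch below. It assumes 4096-byte blocks and uses cmp -l, which lists every differing byte with its 1-based offset; the function name differing_blocks is hypothetical.

```shell
#!/bin/bash

# Print the 0-based index of every 4096-byte block in which two files differ.
# cmp -l reports 1-based byte offsets, hence the "- 1".
differing_blocks() {
    cmp -l "$1" "$2" | awk '{ print int(($1 - 1) / 4096) }' | sort -n | uniq
}

# Possible usage: differing_blocks original.jpg copy.jpg
```

If the corruption really happens in whole pages, the differing bytes should cluster into a few block indexes instead of being spread over the file.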

Friday, February 23, 2007

Hillary Clinton

So, I just attended an event with Hillary Clinton. Interesting. When I walked over to the building, the motorcade was already parked outside. Police, large SUVs, quiet men in black jackets, just like in the movies. When I got inside it was *packed*. I actually got lucky and managed to get in before they closed the doors due to overcrowding. The announcement by one of the VC technicians to keep the aisles clear, and that there was overflow room in one of the adjoining tech talk areas, didn't convince many people to leave. We just squeezed together a bit more...
There were easily over a thousand people in the room.

I didn't expect Clinton to have the views she presented today on the environment and energy. As with so many politicians, she's extremely convincing when in the same room and on stage. She doesn't sound fake, rather genuinely interested in making things happen.

Changing the energy policy to promote clean, green technologies, pulling all levels of government together, from federal to state down to the local level to move in the same direction: 10 years from now, she wants the country to be well on the way to energy independence from fossil fuels without wrecking the economy along the way. Interestingly, she used California as the prime example of what to do. While the per-capita use of energy has increased by 50% in other parts of the US, it has stayed flat in California. She attributes that to California's insistence on sticking with its state-wide incentive programs and pollution limits when the Reagan administration was dismantling whatever such programs existed on the federal level in the early 80's.

She told a nice story about how she vividly remembers that after the Russians shot Sputnik into space, all Americans pulled together to create something amazing, the space program, which grew into DARPA, one of the sponsors of the Internet. Her teacher told the students, "Kids, the president wants you to study math and science". She believed at the time that Eisenhower had actually called her teacher to tell them that. Kinda cute story. She used that as a starting point to talk about how she wants Americans to pull together and rid America of its dependence on fossil fuels, move to clean, green energy and bio-fuels, and fight global warming. She's crossing her fingers for Al Gore that he wins the Oscar for "An Inconvenient Truth". This was a great section of the talk. She came across as very strong and believable.


When asked about the mess in Iraq, she called it "the height of irresponsibility" that President Bush might leave the Iraq war to his successor to sort out. "This is his war. He needs to clean it up. But if he continues to go the way he's going, I don't expect him to." She strongly advocates pulling troops out of the sectarian civil war that is now engulfing the country, and setting up a regional conference involving all neighbors, including Syria and Iran, to make sure they do their part to stabilize the country. She has a very active role in the Senate to rein in Bush's war and "force him to do the right thing. To at least send in troops with all the supplies and material they need to be able to fight this war".
One interesting remark was about how silly it is not to talk to your enemies. Throughout the cold war, both Republican and Democratic presidents talked to the Russians, to learn, to get intelligence and to understand. "You don't make peace with your friends, you don't have to. In order to make peace you have to talk to the people with different views, with ideas that are the exact opposite of yours."

By the way, it will be interesting to see how she's going to pull people to her side who claim she is unelectable next year. She went to great pains to paint herself as an inclusive person who brings people together, using her home state of New York (with a blue New York City and a very red upstate New York) as an example. I'd be curious to know how many of the 68% of New Yorkers who reelected her last year actually live in upstate New York.

Clinton made several remarks about universal health-care and how electronic record keeping can lower the costs substantially. I'm a little ambivalent about this, seeing how easy it is to abuse that kind of data and how often backup tapes get lost and laptops stolen, and knowing how many providers and companies allow confidential employee or patient data to travel around on standard laptops in the open. Extending electronic record keeping to local doctors, who definitely don't have the resources to do a job right that even banks screw up occasionally, makes me shiver. Her presentation on this topic was a bit ... thin.
The general idea of moving away from the company-sponsored health plan system, towards something that many European countries have already developed, sounds like a good idea to me, though.

Finally, she was talking about how important education is. How education starts with the parents. In the family. Then extends out to schools. I'm a big believer in getting parents involved in school. If you know what's going on and are aware, it's much easier to help your children to be successful. Clinton's specific ideas on this topic (while it appears close to her heart) didn't excite me, though. "Make school hours more flexible, because they are inconvenient for working parents. Give kids online access to advanced courses, potentially not offered at their school."
Eh, I don't know, call me old-fashioned, but I think bringing excellent teachers into schools, recognizing what they do every day, and paying them a decent salary would go a long way, along with giving them the tools and means to be the most effective. This won't happen overnight even in the most favorable environments, but you have to start somewhere.

Overall, a high-profile politician with charisma, and some decent ideas. Seeing Barack Obama next would be great. Maybe they pull this off some time this year.

Saturday, February 10, 2007

more grumpy upgrade notes

issues:
- kernel 2.6.18-3 doesn't boot (neither 686 nor k7 variety) - FIXED
- ivtv not installed, no firmware - FIXED
- database schema upgrade fails - FIXED
- playback works, but no sound - FIXED
- mythvideo doesn't work (no mplayer?) - FIXED
- no scheduled recordings show up, but they are in the database - FIXED
- no automatic login for mythtv - FIXED
- lirc/remote control doesn't work (missing kernel modules)
- wireless doesn't work
- mythgames no games
- no nvidia-settings - FIXED
- no ntpd - FIXED
- no mouse in X
- no mrtg set up
- most myth recordings fail
- no sound in live tv and recordings



kernel issues:
2.6.18-3 only boots with acpi=off on grumpy. worked fine in 2.6.15. wth?


installing ivtv:
use 0.8.2, follow Debian instructions at http://ivtvdriver.org. Use pvr_1.18.21.22254_inf.zip from http://ivtv.writeme.ch/tiki-index.php?page=FirmwareVersions.
Later downloaded pvr_1.18.21.22302_inf.zip and tried that, in order to check out audio problems.


importing old database:
can't update schema version beyond 1156 because of duplicate column cmd_repeat in table diseqc_tree. alter table diseqc_tree drop column cmd_repeat
restart mythtv-backend.
FIXED


nvidia-settings:
apt-get install nvidia-settings
but it doesn't seem to do anything useful.


ntpd:
apt-get install ntp-server
FIXED


mythvideo:
apt-get install mplayer win32codecs
FIXED


sound:
plenty of comments in howto, nothing seems to help. probably some silly interaction somewhere.
go back to basics:
- turn off artsd in KDE, turn off all KDE sounds
- test with aplay - works fine
- test with mplayer - works
- mythvideo with mplayer - sound works
- myth playback still has no sound
- after a reboot, audio works properly in myth playback as well. huh, odd.
FIXED


scheduled recordings:
- after the upgrade no channel lineups were bound to /dev/video0 and /dev/video1. ran mythtv-setup, paged through all pages, corrected as needed. upcoming recordings is populated now, and also shows properly in mythwelcome.
FIXED

lirc:
- complains about missing kernel modules
- module-assistant complains the lirc-modules-source is not properly installed, but dpkg -l lirc-modules-source shows all is good, can't find any lirc modules in /lib/modules tree
still unresolved


automatic login:
I'm using gdm. gdmsetup doesn't get me to log in the mythtv user automatically. Annoying. Here's the config snippet in /etc/X11/gdm/gdm.conf that makes it all work properly:


[daemon]
AutomaticLoginEnable=true
AutomaticLogin=mythtv

TimedLoginEnable=true
TimedLogin=mythtv
TimedLoginDelay=5
LocalNoPasswordUsers=mythtv


auto-login FIXED.


wireless:
rebuild ndiswrapper with module-assistant, add configs to /etc/network/interfaces.
manual verification using iwconfig. ifconfig wlan0 up. dhclient wlan0.

interface comes up, but throughput is unbearably slow.


myth recordings fail:
trying to record random shows fails with

2007-02-10 21:33:47.805 Reschedule requested for id 110.
2007-02-10 21:33:51.259 Scheduled 310 items in 3.5 = 0.10 match + 3.35 place
2007-02-10 21:33:51.287 scheduler: Scheduled items: Scheduled 310 items in 3.5 = 0.10 match + 3.35 place
2007-02-10 21:33:51.431 TVRec(2): Changing from None to RecordingOnly
2007-02-10 21:33:51.449 TVRec(2): HW Tuner: 2->2
2007-02-10 21:33:51.525 Channel(/dev/video1): SetInputAndFormat() failed
2007-02-10 21:33:51.526 TVRec(2) Error: Failed to set channel to 3. Reverting to kState_None
2007-02-10 21:33:51.527 TVRec(2): Changing from RecordingOnly to None
2007-02-10 21:33:51.554 Canceled recording (Recorder Failed): Law & Order: Special Victims Unit "Burned": channel 1002 on cardid 2, sourceid 1
2007-02-10 21:33:51.580 scheduler: Canceled recording (Recorder Failed): Law & Order: Special Victims Unit "Burned": channel 1002 on cardid 2, sourceid 1


some random shows are actually recording. *sigh*
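
To get a feel for how often this happens and on which tuner, the failures could be tallied from the backend log. This is only a sketch that assumes the log lines look like the excerpt above; the function name failed_by_card and the log path in the usage comment are assumptions.

```shell
#!/bin/bash

# Count "Recorder Failed" cancellations per capture card from a backend log
# on stdin. Every cancellation is logged twice (once with a "scheduler:"
# prefix), so the prefixed copies are dropped before counting.
failed_by_card() {
    grep 'Canceled recording (Recorder Failed)' |
        grep -v 'scheduler:' |
        sed -E 's/.* on cardid ([0-9]+),.*/\1/' |
        sort -n | uniq -c |
        awk '{ print "cardid " $2 ": " $1 }'
}

# Possible usage (log location is an assumption):
#   failed_by_card < /var/log/mythtv/mythbackend.log
```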

[3 hours later]

ok. I give up. I can't solve the following right now:
- jittery video recording (suspect: ivtv 0.8)
- no sound when recording (suspect: ivtv? mixer?)
- intermittent inability to change channels for recording resulting in failed recordings (suspect: mythtv/ivtv api v2 interaction)

Looks like 2.6.18 with ivtv 0.8.2 is just not stable enough, but closed captioning works...
2.6.16/2.6.17 are actively discouraged since there is ongoing integration with ivtv at the time. Which brings me back to ivtv 0.4 and 2.6.15 and I'm running out of time.

To make matters worse, the Seagate replacement disk I bought threw DMA errors multiple times tonight. I put the 300GB Maxtor disk back into Grumpy. At least I thoroughly removed the dust from the case.

Tuesday, February 06, 2007

Bootstrapping Grumpy

The 300GB Maxtor disk in Grumpy is giving me grief. Has trouble booting, very slow performance. Most certainly a case for the warranty department.
I'm concerned Grumpy won't boot anymore when I cycle power, and I certainly don't feel like losing my MythTV recordings...

I'm re-building Grumpy from the ground up on a new hard drive. I hope my notes from the last year in this blog will be useful.

I started out with the Debian netboot install. Sarge netboot.tar.gz unpacked into chef:/tftpboot, add new host entry in gw:/etc/dhcpd.conf with next-server 192.168.200.10. Works like a charm. In order to get a picture after booting on this VIA M10000 board, I need to pass the vga option to the kernel.

I want to leave the option open of adding another hard drive later and building a RAID1 for the video and movie files. In the Debian installer I manually reconfigured the partitions to use (ext3, swap, RAID), then configured RAID1 with only one member partition. I'm not using LVM here, this is a very straightforward setup. The large partition will be used for video and movie files, the root partition will hold everything else.

And again, I'm impressed by the ease of this network-based installation. Debian rocks.

First thing after the installation and completing base-config, I switch APT's sources to testing, so that I get a recent kernel (in this case I'm going to 2.6.18-3).

surprise: A base Debian install has no ssh...

On to rescuing media files... oha, I forgot how much data 300GB is. This will take a while.

Saturday, February 03, 2007

Automatic wakeup for Grumpy

This had me puzzled for the longest time (see my musings from last summer).

Today I decided not only to finally run some backups of Chef, but also to make Grumpy (my MythTV box) shut down after it's done recording. I stumbled over mythwelcome, a very slick new feature in MythTV 0.19. It's a welcome screen for mythfrontend (similar to what I wanted to write), and it has a few additional features (like automatically turning on the system during a specified period, e.g. in the evening when I'm most likely to watch TV).

A few months ago I collected all the necessary data for nvram-wakeup to set the NVRAM alarm in the BIOS. I got mythwelcome configured just fine following the instructions in the Wiki. It seems to work, but I still don't like messing with the nvram directly. Finally, I saw some notes in the ACPI_Wakeup section of the Wiki about how "some finicky BIOSes reset the RTC alarm when the time gets written". Guess what, every Linux distribution syncs the system time to the hardware clock on shutdown. I have a finicky BIOS!

As suggested in the Wiki, I modified /etc/init.d/hwclock.sh so that on shutdown it reads out /proc/acpi/alarm, saves the system time to the hardware clock, then writes the alarm time back. Lo and behold, it worked on the first try.
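
The idea behind that change, as a minimal sketch rather than the actual hwclock.sh patch: remember the alarm, let the clock sync run, then write the alarm back. The function name preserve_alarm is hypothetical, and the sketch assumes the alarm file hands back a value in the same format it accepts.

```shell
#!/bin/bash

# Save the pending RTC alarm, run the clock-sync command, then restore the alarm.
#   $1        path of the alarm file (/proc/acpi/alarm here)
#   $2...     command that writes the system time to the hardware clock
preserve_alarm() {
    local alarm_file=$1
    shift
    local saved
    saved=$(cat "$alarm_file") || return 1
    "$@"                              # e.g. /sbin/hwclock --systohc
    echo "$saved" > "$alarm_file"     # undo the reset done by a finicky BIOS
}

# Possible usage in the shutdown branch:
#   preserve_alarm /proc/acpi/alarm /sbin/hwclock --systohc
```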

Here are my final configuration settings for this:

I'm using mythwelcome and have it configured to call /usr/local/bin/myth-set-acpi-wakeup in the nvram-wakeup command line:


#!/bin/bash

# set wakeup time via ACPI alarm
#

logger "wakeup: got command line $0 $1 $2 $3 $4 $5 $6"

if [ -z "$2" ]; then
    echo "usage: $0 --settime "
    echo "must be in seconds since epoch"
    logger "no wakeup time given"
    exit 1
fi

# This is a crazy hack
# The problem is that mythwelcome is passing us the timestamp
# in localtime not adjusted for UTC, but date is using the
# base of UTC, so we need to adjust back to PST.
# TODO(bbeck): Daylight savings time is probably going to bite me on this, I should set
# the BIOS clock to UTC.
seconds=$2
utctime=`date -d "1970-01-01 $seconds sec" +"%Y-%m-%d %T"`
time=`date -d "$utctime -0000" +"%Y-%m-%d %T"`
# end of hack

logger "setting wakeup time to $time."

echo "$time" > /proc/acpi/alarm

result=`cat /proc/acpi/alarm`

if [ "$time" == "$result" ]; then
    logger "wakeup time set properly"
else
    logger "wakeup time not set (got $result)"
fi
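
If the BIOS clock were set to UTC, as the TODO in the script suggests, the time zone hack might shrink to a single conversion. This sketch assumes GNU date (for the @seconds syntax) and a true seconds-since-epoch argument; the function name epoch_to_alarm is made up.

```shell
#!/bin/bash

# Turn seconds since the epoch into the "YYYY-MM-DD HH:MM:SS" layout written
# to /proc/acpi/alarm, expressed in UTC. Requires GNU date.
epoch_to_alarm() {
    date -u -d "@$1" +"%Y-%m-%d %T"
}

# Possible usage inside the wakeup script:
#   time=$(epoch_to_alarm "$2")
```

Working in UTC end to end would also sidestep the daylight saving time concern noted in the script.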



The nvram-wakeup command is set to myth-grub-poweroff:

#!/bin/sh

echo "savedefault --default=4 --once" | grub


which triggers grub config 4 in my menu.lst on the next boot:

title PowerOff
savedefault --default=0
cat /boot/grub/default
halt



reboot and poweroff commands are set to

/sbin/reboot and /sbin/poweroff, respectively

Shutdown/Wakeup options in mythtv-setup are configured as follows:

Block shutdown before client is connected is checked.

Idle timeout: 60s

Max. wait for recording: 30m (I'm recording several shows that are 30 minutes apart. No point to shut down in between)

Startup before rec: 180s

Wakeup time format: yyyy-MM-ddThh:mm:ss

Set wakeuptime command: mythshutdown --setwakeup $time

Server halt command: sudo mythshutdown --shutdown

Pre-shutdown check command: mythshutdown --check