Snd_pcm_delay failed error


#21

Looks like the error reporting is trying to convert the error code to string, assuming it’s an ‘errno’ error code (http://man7.org/linux/man-pages/man3/errno.3.html), in this case ‘Device or resource busy’ is equivalent to 16 (https://github.com/raspberrypi/linux/blob/rpi-4.14.y/include/uapi/asm-generic/errno-base.h#L20)

I agree with @mzero, it looks like the values returned by snd_pcm_delay are not actually error codes, but more like some time duration information, like milliseconds, or sample count, or something, so some place in the code is wrongly interpreting those values and printing these confusing messages. On one occasion, I saw a network error reported there too, which makes no sense. :slight_smile:


#22

The thing interpreting as errors (libasound2) and the thing printing them (Pd) are doing the right thing: the value returned by snd_pcm_delay is supposed to be an error code. But some driver or something is causing it to return, incorrectly, a count.


#23

I think we should try and close in on that place returning random values, most likely somewhere in the kernel code.


#24

FOUND!


tl;dr:

  • The values printed are indeed evidence of an underrun (xrun), and are the underrun size.

  • If this only happens during patch loading / unloading - you have nothing to worry about: Pure Data isn’t good about keeping the sound buffers running while doing loading, (I suspect esp. if there are libraries dynamically loaded)

  • If you hear glitches without seeing these messages: It’s the patch, not an underrun.

  • There was a bad kernel patch which messed up this call’s return value and confuses the libraries!

  • The fix has already been commited to the kernel, but it missed the latest Raspian kernel pull by a few weeks. We’ll see it soon enough.


Tonight’s digging:

  1. @miker2049’s discovery that the errors occurred when dsp was on while loading/unloading a patch lead me to try that - and indeed, I can see the errors, too. Good! I have a repro!

  2. Like @miker2049, I only see these on open and close. Otherwise, the patchs I tried all run fine. That said, I note that neither the ARP nor the Moog sound particularly “good” to me, but I don’t think it is xruns or too much processing. I also note that the Euclidian patch has some very transient clicks in it - but I think it is the patch!

  3. I traced more through libasound2 and found that there is the magic LIBASOUND_DEBUG=1 environment setting! So, launch puredata as follows:

    LIBASOUND_DEBUG=1 puredata

Then turn on the dsp in PureData GUI… load a patch… and, back in the terminal:

ALSA lib pcm_hw.c:559:(snd_pcm_hw_delay) SNDRV_PCM_IOCTL_DELAY failed (-4068): Unknown error 4068
ALSA lib pcm_hw.c:603:(snd_pcm_hw_prepare) SNDRV_PCM_IOCTL_PREPARE failed (-16): Device or resource busy
ALSA lib pcm_hw.c:559:(snd_pcm_hw_delay) SNDRV_PCM_IOCTL_DELAY failed (-4080): Unknown error 4080
ALSA lib pcm_hw.c:559:(snd_pcm_hw_delay) SNDRV_PCM_IOCTL_DELAY failed (-3618): Unknown error 3618
ALSA lib pcm_hw.c:603:(snd_pcm_hw_prepare) SNDRV_PCM_IOCTL_PREPARE failed (-16): Device or resource busy
ALSA lib pcm_hw.c:559:(snd_pcm_hw_delay) SNDRV_PCM_IOCTL_DELAY failed (-3630): Unknown error 3630
ALSA lib pcm_hw.c:559:(snd_pcm_hw_delay) SNDRV_PCM_IOCTL_DELAY failed (-3442): Unknown error 3442
ALSA lib pcm_hw.c:603:(snd_pcm_hw_prepare) SNDRV_PCM_IOCTL_PREPARE failed (-16): Device or resource busy

etc…

This means that the problem is at the other end of the ioctl call - in other words in the driver in the kernel. Now, I’m not sure really which stack of kernel modules get involved when playing sound out the Pisound - I did notice that Puredata has /dev/snd/controlC0 though ...C3, each open multiple times on multiple FDs. No idea if that is normal or related.

Kernel spelunking

I went looking… and found the source:

The bug was introduced in this kernel commit: ALSA: pcm: Remove set_fs() in PCM core code - on 2017-05-23. They were trying to optimize the call by having it return the delay, rather than store the value in user process space, returning an error code.

The bug was fixed in this kernel commit: ALSA: pcm: Return negative delays from SNDRV_PCM_IOCTL_DELAY - on 2018-05-01. Because someone in Chrome OS land finally noticed that the callers of that ioctl expected an error return, not the delay itself.

Alas, the fix came after the 4.14.34 kernel that is currently in Raspian. When they next bump the kernel, presumably we’ll get this fix. I don’t know why we didn’t see this before (or perhaps we all ignored it?) It might be because Raspian jumped to 4.14 relatively recently (last few months.)


#25

@mzero Great findings! Amazing work!

For every release RPi makes of its kernel, they update this file with the commit hash of their kernel repository: https://github.com/Hexxeh/rpi-firmware/commits/master/git_hash

Using these hashes and looking at the state of the code on kernel repo, I found that the last affected RPi kernel version was 4.14.37, and 4.14.39 includes the ‘ALSA: pcm: Return negative delays from SNDRV_PCM_IOCTL_DELAY.’ fix.

Even though the Raspbian images at the moment get shipped with older, affected kernel, this can easily be fixed by running:

sudo rpi-update

to get the latest kernel version.


#26

!!!
“sudo rpi-update” has definitely stopped the errors! A few still initial tests and everything is working nicely, but I will keep at it. One thing is that the “shore” patch I linked to above still doesn’t really work very well. Very distorted sound and the red message for “audio I/O error” next to the dsp switch in the terminal.

But that is probably a separate issue and not a big deal. I will try doing some bigger automatonism patches later, but all the blokas curated pure data patches were working for me without a hitch as far I could tell and certainly no errors anymore. As far as I am concerned this issue has been tentatively resolved on my end! Thanks for your efforts @mzero and @Giedrius


#27

Ok, good to know. I have increased the recommended kernel version in install-pisound.sh, if it detects kernel version being lower than 4.14.39, it will recommend upgrading using sudo rpi-update.


#28

I’m a little hesitant to recommend rpi-update because it updates from a kernel build every few days. It is pretty “bleeding edge”.

rpi-update supports a different brance from the same repo via the BRANCH env. variable. There is a stable branch which updates less frequently. Alas, it isn’t clear what the update criterion is, and the relationship to master isn’t clear.


#29

The master and linux_stable branches on RPi Linux repository seem to be pretty old ones, they’re not using them for their releases.

I’ve asked them some time ago about their 4.9 and 4.14 branches, they said that they treat 4.9 as LTS branch, and 4.14 as the mainstream one. However, the last release of 4.9 branch was 4.9.80, released on February 9th, and it does not have any support for RPi 3B+, downgrading to it while running on 3B+ will make the system non-bootable. :slight_smile:

They do have ‘raspberrypi-kernel’ apt-get package, but I don’t know how often it gets updated.

In case an issue occurs with some latest kernel release, it’s possible to downgrade back to some specific version, by providing a commit hash from https://github.com/Hexxeh/rpi-firmware/commits/master to rpi-update as an argument, or reverting back to ‘stock’ kernel using sudo apt-get install --reinstall raspberrypi-bootloader raspberrypi-kernel, as described here.

So while we wait for RPi to update their ‘stock’ kernel, I think rpi-update should be good enough, and eventually the ‘recommended kernel update’ message in install-pisound.sh will be superseded by the stock kernel having higher version in future Raspbian OS images.


#30

Thanks a lot @mzero! Great work digging into the code!


#31

Also confirming that I no longer get the snd_pcm_delay failed error since updating the kernel.