Wrong total number of samples after encoding DSD(.dsf) file #71

Closed
Borewit opened this issue Apr 14, 2019 · 9 comments

@Borewit

Borewit commented Apr 14, 2019

Hi David. I am enhancing the DSD support in music-metadata.

I converted one of my .dsf sample files into a WavPack file using wavpack.exe (wavpack-5.1.0-x64).

I noticed that the total_samples field of the WavPack block header contains a sample length 8 times lower than it should be (is it counted in bytes?).

Related issues:

@Borewit
Author

Borewit commented Apr 14, 2019

Sample files:
#201.zip

Converted with:

c:\utils\wavpack-5.1.0-x64\wavpack.exe -h test\samples\originals\issue\#201\2L-110_stereo-5644k-1b_04_0.1-sec.dsf

Expected sample length: 564480 samples
Found: 70560 samples

@dbry
Owner

dbry commented Apr 15, 2019

First of all thanks for adding WavPack DSD support to music-metadata...very cool!

Yeah, it's a little confusing, but DSD files look like they actually have 8-bit samples, even though they're obviously 1-bit samples. I would suggest that you take a look at the WavPack 5 Porting Guide, especially the third paragraph of section 7.0, for an explanation of this. I realize that you aren't using the WavPack library APIs from Java, but this information is still probably useful because it also applies to the actual WavPack header. Specifically, there was a change made to handle files with over 2^32 samples by using a previously unused byte in the header. I did this in a way that's completely compatible as long as the file isn't super big, but your code will fail if a big one comes along.
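The header change for very long files can be sketched as follows. This is a minimal TypeScript sketch, not the library's code: the function and parameter names are hypothetical, and the combination rule (an all-ones low word meaning "unknown", plus the small correction term) is an assumption based on the `GET_TOTAL_SAMPLES` macro in `wavpack.h`, so it should be checked against that header before being relied on.

```typescript
// Low 32-bit word of all ones is assumed to mean "total length unknown".
const UNKNOWN_TOTAL_SAMPLES = 0xffffffff;

/**
 * Combine the classic 32-bit total_samples field with the extra high byte
 * (the formerly unused header byte) into one sample count.
 *
 * @param low32 unsigned 32-bit total_samples field
 * @param high8 extra high byte; 0 for files below 2^32 samples
 * @returns the total sample count, or -1 when the length is unknown
 */
function getTotalSamples(low32: number, high8: number): number {
  if (low32 === UNKNOWN_TOTAL_SAMPLES) {
    return -1;
  }
  // Plain Number arithmetic is exact here: the result stays below 2^40,
  // far under the 2^53 limit for exact integers.
  // Subtracting high8 is assumed to compensate for the reserved
  // all-ones value in each 32-bit range (see the lead-in).
  return low32 + high8 * 0x100000000 - high8;
}
```

With `high8` equal to 0 the function returns `low32` unchanged, which is why older parsers keep working on files below 2^32 samples.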

So basically you'll have to multiply the number of samples by 8 if the DSD flag is set in the header. A little more complicated is the sample rate. The actual DSD sample rate is the regular sample rate stored in the flags (which I'm sure you already use) times 8 (because it's bytes, not bits) times another multiplier which is contained in the metadata block for the DSD data. You could assume this is 8 because that's the rate of the vast majority of DSD files (all SACDs), but eventually you might run into a higher rate for audiophiles.

Thanks again!
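The scaling described above can be sketched in TypeScript. The interface and function names are hypothetical, and the inputs are assumed to be already decoded from the block header and the DSD metadata block. The example values come from this thread: 70560 header samples, a header sample rate of 88200, and a multiplier of 8, which matches the 5644.8 kHz source file.

```typescript
// Values a parser is assumed to have decoded already (names are illustrative).
interface WavPackStreamInfo {
  isDsd: boolean;           // DSD flag from the block header flags
  headerSamples: number;    // total samples as stored in the header
  headerSampleRate: number; // sample rate decoded from the header flags
  dsdMultiplier: number;    // extra rate multiplier from the DSD metadata block
}

interface AudioLength {
  samples: number;
  sampleRate: number;
  durationSeconds: number;
}

// For DSD, the header counts bytes of 1-bit samples, so both the sample
// count and the rate are scaled by 8; the rate is scaled again by the
// multiplier stored with the DSD data.
function resolveLength(info: WavPackStreamInfo): AudioLength {
  const samples = info.isDsd ? info.headerSamples * 8 : info.headerSamples;
  const sampleRate = info.isDsd
    ? info.headerSampleRate * 8 * info.dsdMultiplier
    : info.headerSampleRate;
  return { samples, sampleRate, durationSeconds: samples / sampleRate };
}

// Sample file from this thread.
const issueSample = resolveLength({
  isDsd: true,
  headerSamples: 70560,
  headerSampleRate: 88200,
  dsdMultiplier: 8,
});
```

For the sample file this gives 564480 samples at 5644800 Hz, which is the expected length of 0.1 seconds.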

@Borewit
Author

Borewit commented Apr 15, 2019

Thanks a lot for your reply David!

> Yeah, it's a little confusing, but DSD files look like they actually have 8-bit samples, even though they're obviously 1-bit samples. I would suggest that you take a look at the WavPack 5 Porting Guide, especially the third paragraph of section 7.0, for an explanation of this.

Okay, thanks, I understand it now.

I did indeed use the file format description. It would be great if a short description of DSD could be added to section 2.0, Block Header.

Is the second factor stored in the ID_ALT_TRAILER metadata-sub-block or really inside the DSD data?

> Specifically there was a change made to handle files with over 2^32 samples by using a previously unused byte in the header. I did this in a way that's completely compatible as long as the file isn't super big, but your code will fail if a big one comes along.

Thanks for the warning, I already got it from the documentation, so you did a great job there.

From the WavPack sample I attached earlier, foobar2000 seems to be able to extract metadata:

Artist Name :  CANTUS (Tove Ramlo-Ystad) & Frode Fjellheim
Track Title :  Kyrie
Album Title :  SPES
Date :         2015
Genre :        Choral
Composer :     Frode Fjellheim
Performer : 
Album Artist : CANTUS (Tove Ramlo-Ystad) & Frode Fjellheim
Track Number : 4
Total Tracks : 12
Disc Number :  1
Total Discs : 
Comment :      Generated by Merging Technologies Album Publishing (...)
<ENCODED BY> : Merging Technologies Album Publishing
<ISRC> :       NOMPP1501040
<PUBLISHER> :  2L

I am a bit confused here: there is no APEv2 tag header set.
Do you have any idea where it got that information from, David?

Borewit added a commit to Borewit/music-metadata that referenced this issue Apr 15, 2019
@dbry
Owner

dbry commented Apr 16, 2019

That multiplier is the first byte of the ID_DSD_BLOCK metadata sub-block. The normal value is 2 so the rate is multiplied by 8 (1 << 2).
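Reading that multiplier can be sketched as below. This is a hedged TypeScript sketch: the function name is hypothetical, and it assumes the first payload byte of the ID_DSD_BLOCK sub-block is a shift count (as the `1 << n` notation suggests), with the payload already extracted from the metadata area.

```typescript
/**
 * Derive the DSD rate multiplier from the payload of an ID_DSD_BLOCK
 * sub-block. Assumption: the first byte is a shift count n, and the
 * multiplier is 1 << n (for example, a stored byte of 3 yields 8).
 */
function dsdRateMultiplier(dsdBlockPayload: Uint8Array): number {
  if (dsdBlockPayload.length === 0) {
    throw new Error("DSD block payload is empty");
  }
  const view = new DataView(
    dsdBlockPayload.buffer,
    dsdBlockPayload.byteOffset,
    dsdBlockPayload.byteLength,
  );
  const shift = view.getUint8(0);
  // JavaScript shifts operate on signed 32-bit integers, so reject
  // values that would overflow instead of returning a negative number.
  if (shift > 30) {
    throw new Error("implausible DSD rate shift: " + shift);
  }
  return 1 << shift;
}
```

The actual DSD sample rate is then the header sample rate times 8 times this multiplier.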

As for the compressed .dsf file, I think I know what's going on there. As you know, .dsf files have an optional ID3v2 tag in the "trailer" (which in WavPack is simply everything past the audio data). When you compress the file that trailer gets stored in the ID_ALT_TRAILER metadata item so that the original file can be completely restored. I can see this in your file with the wvunpack -ss command which shows file wrapper: 92 + 287166 bytes (DSD , ID3?). Apparently Foobar2000 looks at that when reading the file.
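A reader that wants to do the same could first check whether the stored trailer really holds an ID3v2 tag. The TypeScript sketch below is illustrative only: the function and interface names are hypothetical, and it assumes the ID_ALT_TRAILER payload begins directly with the ID3v2 tag. The 10-byte header layout ("ID3", version, flags, four syncsafe size bytes) follows the ID3v2 specification.

```typescript
interface Id3v2Header {
  majorVersion: number; // e.g. 3 for ID3v2.3, 4 for ID3v2.4
  revision: number;
  flags: number;
  tagSize: number;      // size of the tag body, excluding the 10-byte header
}

/**
 * Inspect the first 10 bytes of a trailer and decode the ID3v2 header.
 * Returns null when the bytes do not look like an ID3v2 tag.
 */
function readId3v2Header(trailer: Uint8Array): Id3v2Header | null {
  if (trailer.length < 10) {
    return null;
  }
  const view = new DataView(trailer.buffer, trailer.byteOffset, trailer.byteLength);
  // Signature "ID3" = 0x49 0x44 0x33
  if (view.getUint8(0) !== 0x49 || view.getUint8(1) !== 0x44 || view.getUint8(2) !== 0x33) {
    return null;
  }
  // The size is stored as four "syncsafe" bytes: 7 significant bits each,
  // with the top bit always clear.
  let tagSize = 0;
  for (let i = 6; i < 10; i++) {
    const b = view.getUint8(i);
    if ((b & 0x80) !== 0) {
      return null; // not a valid syncsafe byte
    }
    tagSize = (tagSize << 7) | b;
  }
  return {
    majorVersion: view.getUint8(3),
    revision: view.getUint8(4),
    flags: view.getUint8(5),
    tagSize,
  };
}
```

If the header is found, the tag body can be handed to an existing ID3v2 parser rather than being decoded by hand.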

If you compress using the --import-id3 option in WavPack, then those ID3 tags are also imported into standard APEv2 tags. This means unfortunately that they're taking twice as much space, but it's the only way to have the ID3v2 tags for restoring the file and the APEv2 tags that are standard for WavPack.

Thanks again!

@dbry
Owner

dbry commented Apr 16, 2019

And yes, I will update the file format description document a little more for DSD...thanks for letting me know about that.

@Borewit
Author

Borewit commented Apr 16, 2019

No, thank you, David.

I detect the following metadata sub-blocks: ID_ALT_TRAILER & ID_BLOCK_CHECKSUM.
I cannot find the ID_DSD_BLOCK.
The sub-block sizes add up exactly to the total metadata block size, which gave me some confidence that I parsed it the right way. Should it really be present in that file, David?

Regarding the documentation (which got my attention while I was trying to fix the above): this part of the documentation confused me as well (it caught me twice, in fact, due to my bad memory):

// 0x1f metadata function id

So 5 bits. But the following is actually included in the ID enumerations:

// 0x20 decoder needn't understand metadata

confirmed by:

// ids from here are “optional” so decoders should skip them if they don't understand them

So the optional flag is clearly part of the enum, while the large block flag is not. To make this more explicit, I suggest describing the function_id with a 0x3f mask (6 bits). In addition, you could still note that bit 0x20 of the function_id (that is, any ID greater than 0x1F) marks the sub-block as optional.
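The suggested interpretation of the ID byte can be sketched in TypeScript. The constant and function names are illustrative. The 0x3f and 0x20 masks come from this discussion; the 0x40 (odd size) and 0x80 (large block) flag values are assumptions taken from the file format description and should be checked against it.

```typescript
const ID_UNIQUE = 0x3f;        // 6-bit function id, as proposed above
const ID_OPTIONAL_DATA = 0x20; // part of the id: decoder may skip the block
const ID_ODD_SIZE = 0x40;      // assumed: actual data size is one byte shorter
const ID_LARGE = 0x80;         // assumed: size field is 3 bytes instead of 1

interface SubBlockId {
  functionId: number; // includes the optional bit, matching the ID enumeration
  optional: boolean;
  oddSize: boolean;
  large: boolean;
}

// Split the first byte of a metadata sub-block into its id and flags.
function decodeSubBlockId(idByte: number): SubBlockId {
  return {
    functionId: idByte & ID_UNIQUE,
    optional: (idByte & ID_OPTIONAL_DATA) !== 0,
    oddSize: (idByte & ID_ODD_SIZE) !== 0,
    large: (idByte & ID_LARGE) !== 0,
  };
}
```

Masking with 0x3f keeps the optional bit inside `functionId`, so an id such as 0x2f (BLOCK_CHECKSUM) compares equal to its enumeration value without extra bit juggling.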

I will look into decoding the stored original ID3v2 header later. It really puzzled me where that metadata came from.

The boys and girls of Foobar2000 did a great job apparently! ;-)

@dbry
Owner

dbry commented Apr 16, 2019

There's a simple command-line program in the cli directory that parses WavPack files showing all the blocks and breaking down the metadata blocks. Using that on the file you sent me shows this:

 WVPARSER  WavPack Audio File Parser Test Filter  Version 1.00
 Copyright (c) 1998 - 2019 David Bryant.  All Rights Reserved.


stereo audio block, version 0x410, 22050 samples in 19272 bytes, time = 0.00-0.25
samples are 8 bits in 1 bytes, shifted 0 bits, sample rate = 88200
flags: INITIAL DSD CHECKSUM FINAL
  metadata: ID = 0x28 (ALT_EXTENSION), size = 3 bytes
  metadata: ID = 0x23 (ALT_HEADER), size = 92 bytes
  metadata: ID = 0x25 (CONFIG_BLOCK), size = 3 bytes
  metadata: ID = 0x2a (NEW_CONFIG), size = 2 bytes
  metadata: ID = 0x0e (DSD_BLOCK), size = 19119 bytes
  metadata: ID = 0x2f (BLOCK_CHECKSUM), size = 4 bytes

stereo audio block, version 0x410, 22050 samples in 18722 bytes, time = 0.25-0.50
samples are 8 bits in 1 bytes, shifted 0 bits, sample rate = 88200
flags: INITIAL DSD CHECKSUM FINAL
  metadata: ID = 0x2a (NEW_CONFIG), size = 2 bytes
  metadata: ID = 0x0e (DSD_BLOCK), size = 18676 bytes
  metadata: ID = 0x2f (BLOCK_CHECKSUM), size = 4 bytes

stereo audio block, version 0x410, 13230 samples in 11034 bytes, time = 0.50-0.65
samples are 8 bits in 1 bytes, shifted 0 bits, sample rate = 88200
flags: INITIAL DSD CHECKSUM FINAL
  metadata: ID = 0x2a (NEW_CONFIG), size = 2 bytes
  metadata: ID = 0x0e (DSD_BLOCK), size = 10987 bytes
  metadata: ID = 0x2f (BLOCK_CHECKSUM), size = 4 bytes

stereo audio block, version 0x410, 13230 samples in 11134 bytes, time = 0.65-0.80
samples are 8 bits in 1 bytes, shifted 0 bits, sample rate = 88200
flags: INITIAL DSD CHECKSUM FINAL
  metadata: ID = 0x2a (NEW_CONFIG), size = 2 bytes
  metadata: ID = 0x0e (DSD_BLOCK), size = 11088 bytes
  metadata: ID = 0x2f (BLOCK_CHECKSUM), size = 4 bytes

non-audio block of 287208 bytes, version 0x410
  metadata: ID = 0x24 (ALT_TRAILER), size = 287166 bytes
  metadata: ID = 0x2f (BLOCK_CHECKSUM), size = 4 bytes

end of file

So, the DSD blocks are definitely in there. Not sure why your parsing would miss them (maybe the odd size?), but you should be able to compare this output to yours. Note that I am obviously not taking the DSD rate multiplier into account in this program (yet), which is why it shows a length of 0.80 seconds instead of 0.10 seconds.
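For comparing against the listing above, a sub-block walker that handles odd sizes might look like the TypeScript sketch below. It is not the wvparser implementation. It assumes the layout from the file format description: an ID byte, a size in 16-bit words (1 byte, or 3 little-endian bytes when the large flag is set), data padded to an even length, and an odd-size flag meaning the last padding byte is not data. All names are hypothetical.

```typescript
const FUNCTION_ID_MASK = 0x3f;
const ODD_SIZE_FLAG = 0x40; // assumed flag value
const LARGE_FLAG = 0x80;    // assumed flag value

interface SubBlock {
  id: number;         // function id, masked with 0x3f
  size: number;       // actual data size in bytes
  dataOffset: number; // offset of the data within the metadata area
}

// Walk the metadata area of one block (the bytes after the block header).
function listSubBlocks(metadata: Uint8Array): SubBlock[] {
  const view = new DataView(metadata.buffer, metadata.byteOffset, metadata.byteLength);
  const blocks: SubBlock[] = [];
  let pos = 0;
  while (pos + 2 <= metadata.length) {
    const idByte = view.getUint8(pos);
    let words = view.getUint8(pos + 1);
    let headerLength = 2;
    if ((idByte & LARGE_FLAG) !== 0) {
      if (pos + 4 > metadata.length) {
        throw new Error("truncated large sub-block header");
      }
      words |= (view.getUint8(pos + 2) << 8) | (view.getUint8(pos + 3) << 16);
      headerLength = 4;
    }
    const paddedSize = words * 2;
    const oddSize = (idByte & ODD_SIZE_FLAG) !== 0;
    if (oddSize && paddedSize === 0) {
      throw new Error("odd-size flag on an empty sub-block");
    }
    const dataOffset = pos + headerLength;
    if (dataOffset + paddedSize > metadata.length) {
      throw new Error("sub-block overruns the metadata area");
    }
    blocks.push({
      id: idByte & FUNCTION_ID_MASK,
      size: oddSize ? paddedSize - 1 : paddedSize,
      dataOffset,
    });
    // Always advance by the padded size so the next id byte lines up.
    pos = dataOffset + paddedSize;
  }
  return blocks;
}
```

An odd-sized entry such as the 19119-byte DSD_BLOCK above occupies 19120 bytes on disk; advancing by the unpadded size would misalign every following sub-block.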

Yeah, I totally agree that the ID mask should be 0x3F, so I'll fix that too. Sorry you had to run into that multiple times.... :)

Borewit added a commit to Borewit/music-metadata that referenced this issue Apr 22, 2019
Borewit added a commit to Borewit/music-metadata that referenced this issue Apr 22, 2019
Borewit added a commit to Borewit/music-metadata that referenced this issue Apr 22, 2019
@Borewit
Author

Borewit commented Apr 22, 2019

Thanks a lot for all your help, David!

DSD support, including DSD support for WavPack, has been added and released in music-metadata v3.6.0.

@dbry
Owner

dbry commented Apr 23, 2019

Cool, glad to help, thanks again for including WavPack DSD support in your code!
