Concurrent read operation? #356
Comments
You can do things in parallel; there should not be any limitation there. I will be off the grid for at least 1 week. If you can prepare a reliable reproduction, that would be very helpful.
Sorry, I am not used to developers/maintainers answering that fast, haha. Thanks!!
@iwaduarte, you cannot read from a stream twice, nor from a tokenizer. You can use
const fileType = require("file-type");
const musicMetadata = require("music-metadata");
const strtok3 = require('strtok3')
const axios = require('axios');
const url = 'https://test-audio.netlify.app/' + encodeURI('Various Artists - 2008 - netBloc Vol 13 - Color in a World of Monochrome [AAC-40]') + '/' + encodeURI('1.01. Sweet Man Like Me.m4a');
(async () => {
const someHttpRequestStream = await axios.get(url, { responseType: "stream"})
const stream = await fileType.stream(someHttpRequestStream.data);
if (stream.fileType && stream.fileType.mime.startsWith('audio/')) {
console.log('Found audio file: ' + stream.fileType.mime);
const tokenizer = await strtok3.fromStream(stream);
const mm = await musicMetadata.parseFromTokenizer(tokenizer);
console.log(mm);
}
})();
on runkit.com
Under normal conditions you would rely on the HTTP MIME type:
const musicMetadata = require("music-metadata");
const strtok3 = require('strtok3')
const axios = require('axios');
const url = 'https://test-audio.netlify.app/' + encodeURI('Various Artists - 2008 - netBloc Vol 13 - Color in a World of Monochrome [AAC-40]') + '/' + encodeURI('1.01. Sweet Man Like Me.m4a');
(async () => {
const someHttpRequestStream = await axios.get(url, { responseType: "stream"})
if (someHttpRequestStream.headers['content-type'] && someHttpRequestStream.headers['content-type'].startsWith('audio/')) {
const mm = await musicMetadata.parseStream(someHttpRequestStream.data);
console.log(mm);
}
})();
If you trust the HTTP MIME-type you can pass that to
const musicMetadata = require("music-metadata");
const strtok3 = require('strtok3')
const axios = require('axios');
const url = 'https://test-audio.netlify.app/' + encodeURI('Various Artists - 2008 - netBloc Vol 13 - Color in a World of Monochrome [AAC-40]') + '/' + encodeURI('1.01. Sweet Man Like Me.m4a');
(async () => {
const someHttpRequestStream = await axios.get(url, { responseType: "stream"});
const contentType = someHttpRequestStream.headers['content-type'];
if (contentType && contentType.startsWith('audio/')) {
const mm = await musicMetadata.parseStream(someHttpRequestStream.data, {mimeType: contentType});
console.log(mm);
}
})();
Yet another option is to read from the URL twice: once for the type, and a second time for the data.
I have created 2 [stream.PassThrough](https://nodejs.org/api/stream.html#stream_class_stream_passthrough) streams (transform streams) and sent them directly to the music-metadata and file-type stream methods. It does not work; it fails silently. However, if I use music-metadata together with writing to disk, they work flawlessly.
As you can see above, I am reading the same stream concurrently, so I do not know why that would not work for those two packages. Am I missing something?
Unfortunately I cannot trust the MIME type. I need the library to be able to determine whether I have a valid file or not.
That is ineffective and not a solution for my use case. I would also ask: why could I not force the use of file-type inside music-metadata as an option in the constructor configuration? Let's say I do not want to provide anything, but let music-metadata figure that out for me using file-type. Would that feature be hard to implement? I know it is another repo, but if you think it could be done, I could create a PR or issue there and continue the discussion.
Apparently you can create kind of 2 duplicate streams with |
They are not independent. They rely on the main source and its backpressure, so if one of the PassThrough streams is not being read, it will stall the main stream. Therefore, I believe the fix would be to close the stream once reading stops. I suspect this error is in file-type; I will run some tests on my side to confirm. You have not answered my final question, though.
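The backpressure coupling described above can be reproduced with Node's built-in streams alone. The following is a minimal sketch, not the code from this issue: `demoBackpressure` and its parameters are hypothetical names, and the in-memory source stands in for the HTTP response stream. It pipes one source into two PassThrough branches and reads only from the first.

```javascript
const { Readable, PassThrough } = require('stream');

// Hypothetical demo helper (not part of file-type or music-metadata).
// Pipes one source into two PassThrough branches and reports how much
// data branch A received after a short settling period.
function demoBackpressure(consumeB, totalChunks = 100, chunkSize = 16 * 1024) {
  return new Promise((resolve) => {
    let produced = 0;
    // In-memory source standing in for the real HTTP stream (assumption).
    const source = new Readable({
      read() {
        if (produced < totalChunks) {
          produced++;
          this.push(Buffer.alloc(chunkSize));
        } else {
          this.push(null); // signal end of data
        }
      }
    });

    const branchA = new PassThrough();
    const branchB = new PassThrough();
    source.pipe(branchA);
    source.pipe(branchB);

    let receivedByA = 0;
    let ended = false;
    branchA.on('data', (chunk) => { receivedByA += chunk.length; });
    branchA.on('end', () => { ended = true; });

    // When consumeB is false, nobody reads branch B: its buffers fill up
    // and pipe() pauses the shared source, which starves branch A as well.
    if (consumeB) branchB.resume();

    setTimeout(() => {
      resolve({ receivedByA, ended, expected: totalChunks * chunkSize });
    }, 200);
  });
}
```

Calling `demoBackpressure(false)` should leave branch A with only a fraction of the data and no `end` event, while `demoBackpressure(true)` lets it finish. This matches the observation that an idle branch stalls the whole pipeline rather than failing loudly.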
That is actually the way it's done, @iwaduarte. In case the file type is not provided by MIME-type or extension, it will fall back on file-type auto detection. Yet this is not recommended, as the file type is guessed from no more than an initial portion of the file. So if you have an MP3 with a 500 kB ID3v2 tag header (which is realistic if a large cover is embedded), it is likely unable to figure out that it is an MP3 file.
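To make the size argument concrete, here is a small sketch that reads the length of an ID3v2 tag from its 10-byte header and checks whether the audio data would start inside a fixed-size detection sample. The helper names are hypothetical, the 4100-byte default is the sample size mentioned in this thread, and the optional ID3v2.4 footer is ignored for simplicity.

```javascript
// ID3v2 header layout: "ID3" (3 bytes), version (2), flags (1),
// size (4 bytes, sync-safe: only the low 7 bits of each byte are used).
function id3v2TagLength(buf) {
  if (buf.length < 10 || buf.toString('latin1', 0, 3) !== 'ID3') {
    return 0; // no ID3v2 tag at the start of the buffer
  }
  const size =
    ((buf[6] & 0x7f) << 21) |
    ((buf[7] & 0x7f) << 14) |
    ((buf[8] & 0x7f) << 7) |
    (buf[9] & 0x7f);
  return 10 + size; // header plus tag body (optional footer ignored)
}

// Hypothetical check: does the audio data begin inside the sample
// that a stream-based detector gets to look at?
function audioStartsWithinSample(buf, sampleSize = 4100) {
  return id3v2TagLength(buf) < sampleSize;
}

// A header announcing a 512000-byte tag body (about 500 kB).
const largeTagHeader = Buffer.from([0x49, 0x44, 0x33, 3, 0, 0, 0, 31, 32, 0]);
console.log(id3v2TagLength(largeTagHeader));          // 512010
console.log(audioStartsWithinSample(largeTagHeader)); // false
```

With a tag of that size, the first MPEG frame lies far beyond the sample, so a detector limited to the sample can only see the ID3 header itself.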
I know that is the way you have it (I have checked the codebase). But what I meant is to have the output from file-type, because it is more user-friendly. For instance, { ext: 'mp3', mime: 'audio/mpeg' } is better than 'MPEG 1 Layer 3', and { ext: 'wav', mime: 'audio/vnd.wave' } is better than 'WAVE', and so on. Using your package I had to map sometimes on codec and sometimes on extension, and that is very fragile. I know that file-type operates using magic numbers. Of course it would be better to provide the MIME type to your library, but that would require using file-type first (in my use case). Do you get my point here? Also, I have not seen any reference in their npm repo to the problem you are pointing out. (I am guessing it could take longer to identify these ID3v2 tag headers? Or is this a very specific use case?) In summary, what I would like is to have the mime and ext offered in the final output, specifically when music-metadata had to rely on file-type. Does that make sense?
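The kind of hand-written mapping described above might look roughly like the sketch below. It is purely illustrative: the table and the `toFileType` helper are hypothetical, the two entries are the only pairs given in the comment, and the `{ codec, container }` input shape is an assumption about what the caller has available.

```javascript
// Hypothetical lookup table, built only from the two pairs quoted above.
// It is not part of music-metadata or file-type.
const LABEL_TO_FILE_TYPE = new Map([
  ['MPEG 1 Layer 3', { ext: 'mp3', mime: 'audio/mpeg' }],
  ['WAVE', { ext: 'wav', mime: 'audio/vnd.wave' }],
]);

// Tries the codec label first, then the container label (assumed fields).
// Returns undefined when neither label is known, which is exactly where
// this approach becomes fragile: every new format needs a new entry.
function toFileType(format) {
  return (
    LABEL_TO_FILE_TYPE.get(format.codec) ||
    LABEL_TO_FILE_TYPE.get(format.container)
  );
}

console.log(toFileType({ codec: 'MPEG 1 Layer 3' }));
console.log(toFileType({ container: 'WAVE' }));
console.log(toFileType({ container: 'something else' })); // undefined
```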
I am actually the one who raised the issue of the 4100 byte sample, and I have refactored file-type to overcome that issue. Only fileTypeStream(readableStream, options?) has that limitation, which since very recently is based on a user-defined sample size.
I wanted to show you the limitation with the following example:
const fs = require("fs");
const FileType = require('file-type');
const path = 'fixture/02 - Poxfil - Solid Ground.mp3';
(async () => {
// File type detection with full file access
const fileType = await FileType.fromFile(path);
console.log(fileType); // expected to detect the MP3
// File type detection using the preceding sample of 4100 bytes
const stream1 = fs.createReadStream(path);
const stream2 = await FileType.stream(stream1);
console.log(stream2.fileType); // expected to not detect MP3
})();
The MP3 file: Actually it returned the right file type. The output was:
I forgot that if just the ID3 header is found, it will guess it's an MP3. I can promise you I can make it guess wrong, as there are a few audio formats that also use ID3v2 headers. The point is, the best type determination may require parsing a very large portion of the file. That gets you in trouble if you read from a stream, want the super-duper detection, and then decide what to do with the stream. For that very same reason I cannot rely on the super-duper file detection of file-type in music-metadata. Can the audio file type potentially be specified even better in music-metadata? Yes, that would be possible. Should music-metadata then be a kind of plugin of file-type? We did indeed think about specialized plugins. Can we return the most suitable MIME-type based on the content in music-metadata? Yeah, that can be made so,
Awesome example, but I guess I could also make music-metadata guess wrong if I provide the wrong MIME type, couldn't I? I think I have tested that before.
I do not understand this. Are you saying that music-metadata goes further in detection after the file-type step? I thought it was 2 ordered steps:
I do not think that is the way to go, is it? I guess metadata goes beyond file type and extension. How would you put, let's say, a duration property inside file-type?
Awesome. Should I create an issue for that (or maybe we should just refer to this discussion)?
I am sorry, no, you cannot. If the MIME-type is provided, there is nothing to guess. You probably mean that you can provide non-compliant data and get some error. Sure.
To get music-metadata to start parsing, the right container (parser) must be selected. As explained above, preferably the MIME-type or file extension is used for that. Only when those are not available is file-type used for content-based detection. Some extensions or MIME-types depend on the codec being used. For example, somewhere in the ASF parser we may say, this should be an
Maybe it is good if you create a new issue and summarize what the plan is. As this subject is, in my opinion, tricky and easily causes misunderstandings, I think it is important to be clear upfront about what the change should be.
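The selection order described in this comment (MIME-type first, then file extension, then content-based detection as a last resort) can be sketched as follows. This is a simplified illustration under stated assumptions, not music-metadata's actual implementation: `selectParser`, the two lookup tables, the parser identifiers and the `detectFromContent` callback are all hypothetical.

```javascript
const path = require('path');

// Hypothetical lookup tables with a couple of illustrative entries.
const PARSER_BY_MIME = { 'audio/mpeg': 'mpeg', 'audio/mp4': 'mp4' };
const PARSER_BY_EXT = { '.mp3': 'mpeg', '.m4a': 'mp4' };

// Picks a parser id and reports which hint decided it.
// detectFromContent stands in for a content sniffer such as file-type
// and is only called when neither MIME-type nor extension is usable.
function selectParser(fileInfo = {}, detectFromContent = () => undefined) {
  const mime = fileInfo.mimeType &&
    fileInfo.mimeType.split(';')[0].trim().toLowerCase();
  if (mime && PARSER_BY_MIME[mime]) {
    return { parser: PARSER_BY_MIME[mime], source: 'mime' };
  }

  const ext = fileInfo.path && path.extname(fileInfo.path).toLowerCase();
  if (ext && PARSER_BY_EXT[ext]) {
    return { parser: PARSER_BY_EXT[ext], source: 'extension' };
  }

  const guessed = detectFromContent();
  return guessed ? { parser: guessed, source: 'content' } : undefined;
}

console.log(selectParser({ mimeType: 'audio/mpeg' }));
console.log(selectParser({ path: '1.01. Sweet Man Like Me.m4a' }));
console.log(selectParser({}, () => 'mpeg'));
```

Because a provided MIME-type short-circuits the other steps, a wrong hint is not "guessed wrong": the selected parser simply fails on data that does not match it.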
I'm seeing something similar here; where I have two PassThrough streams; into which I've piped a main stream; the The blocking async call seems to be the A simple, hacky workaround I implemented was to "end" the PassThrough stream given to file-type as soon as it becomes readable after making the call to
I created a more detailed report here: sindresorhus/file-type#532 (comment) |
I am trying to read audio/video metadata with two distinct packages that use your implementation.
I got an error that says:
Error: Concurrent read operation?
peek-readable/lib/StreamReader.ts
Line 106 in 9f9ce03
Is there any way I can achieve that? I thought that maybe having an interface that allowed me to extract mimeType and extension from the music-metadata library (since it uses file-type internally if fileInfo is not provided) would be best.
But since I could not find a way of getting duration, mime and extension in one go, I am looking for alternatives to use these packages in parallel by having an intermediate passthrough stream.
Could you help me here?
@Borewit