Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpuinfo: fixes and enhancements #297

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

kolyshkin
Copy link

@kolyshkin kolyshkin commented May 18, 2020

This is a set of commits aiming to simplfy, fix and improve cpuinfo parsers.

Please review commit by commit. Below is a copy-paste of individual commit descriptions.

cpuinfo_test: fix test data

Real /proc/cpuinfo files do not contain empty lines at the start
or at the end of file.

parseCPUInfoARM: fix for kernel 3.7+

Since Linux kernel 3.8-rc1 (commit b4b8f770eb10a "ARM: kernel:
update cpuinfo to print all online CPUs features", Sep 10 2012)
the kernel does not print "Processor" as the first line of
/proc/cpuinfo. Instead, it adds a line named "model name"
with similar contents.

Fix the code for this case, and add more tests.

Fixes #294

parseCPUInfoARM: only parse Features once

The `Features` line is the same for all CPUs, and yet it is assigned to
and parsed many times.

Optimize the code so it is only parsed once and assigned in place.

parseCPUInfo: optimize line scan

Avoid double check of each line for ":" -- since we're splitting the line anyway,
we can reuse the result.

This also prevents a potential panic of referencing non-existen field[1]
in case input contains a ":" with no space after.

cpuinfo: read data in place

Instead of reading the data to the buffer, then making a reader and a
scanner out of that buffer, let's analyze the data as we read it line by
line.

parseCPUInfo*: add error checking

In case the input can't be read, scanner.Scan() stops and the code is
supposed to check for error using scanner.Err(). Do that.

In other words, instead of returning half-read cpuinfo, return an error.

cpuinfo: rm GetFirstNonEmptyLine, simplify scan

There is no sense to skip empty lines at the beginning of the file since
there are none (there were in test data but it is fixed by an earlier
commit).

Logic is simpler now.

Real /proc/cpuinfo files do not contain empty lines at the start
or at the end of file.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
cpuinfo.go Outdated
case "processor":
cpuinfo = append(cpuinfo, commonCPUInfo) // start of the next processor
cpuinfo = append(cpuinfo, CPUInfo{VendorID: modelName}) // start of the next processor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link appears to be not as helpful as I wanted, but the conversation was about that "processor" is only present for CONFIG_SMP (pre v3.8) kernels, so you likely need to figure out how to support kernels without that option.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I have figured it out, please see the updated commit 928d8ea, to which I have added a test case with data from the above comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Cool!

Copy link
Author

@kolyshkin kolyshkin May 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bluecmd can you please review it then? I am aware you're not a maintainer but you wrote some of this code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't write any of this code, I just filed bugs against it :-)

Since Linux kernel 3.8-rc1 (commit b4b8f770eb10a "ARM: kernel:
update cpuinfo to print all online CPUs features", Sep 10 2012)
the kernel does not print "Processor" as the first line of
/proc/cpuinfo. Instead, it adds a line named "model name"
with similar contents.

Fix the code for this case, and add more tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The `Features` line is the same for all CPUs, and yet it is assigned to
and parsed many times.

Optimize the code so it is only parsed once and assigned in place.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Avoid double check of each line for ":" -- since we're splitting the line anyway,
we can reuse the result.

This also prevents a potential panic of referencing non-existen field[1]
in case input contains a ":" with no space after.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of reading the data to the buffer, then making a reader and a
scanner out of that buffer, let's analyze the data as we read it line by
line.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case the input can't be read, scanner.Scan() stops and the code is
supposed to check for error using scanner.Err(). Do that.

In other words, instead of returning half-read cpuinfo, return an error.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There is no sense to skip empty lines at the beginning of the file since
there are none (there were in test data but it is fixed by an earlier
commit).

Logic is simpler now.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Author

@discordianfish @pgier PTAL

}

func parseCPUInfoX86(info []byte) ([]CPUInfo, error) {
scanner := bufio.NewScanner(bytes.NewReader(info))
func parseCPUInfoX86(info io.Reader) ([]CPUInfo, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we decided to prefer reading everything in a buffer first. While this is unlikely to change mid reading, I think it's what we decided to to in general. But not sure, @SuperQ wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have been doing that previously, although I think this pattern is fine also. The OS is probably buffering the file content anyway. And I think io.Reader is better to use as a parameter for most parseXXX functions because it's a little more generic than a string or byte slice.

If we do want to buffer the entire file first, then it would probably be better to create a reader from the byte slice and then pass the reader to the parse function instead of passing the byte slice directly. The byte slice parameter here might just be left over from me thinking I would need random access instead of line by line access to parse certain file formats.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sounds reasonable, then let's keep is a reader.

Copy link
Collaborator

@pgier pgier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, thanks for the contribution! I think this will have a couple minor conflicts with #296, so this might need one more update after that one is merged.

@@ -183,22 +172,16 @@ func parseCPUInfoX86(info []byte) ([]CPUInfo, error) {
cpuinfo[i].PowerManagement = field[1]
}
}
return cpuinfo, nil
return cpuinfo, scanner.Err()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If scanner.Err() is not nil here, I'm wondering if we should return a nil instead of cpuinfo. At least that's the pattern we've usually followed for other parts of this library.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if scanner.Err() is not nil here, I'm wondering if we should return a nil instead of cpuinfo. At least that's the pattern we've usually followed for other parts of this library.

It seems like an unnecessary complication. A user should always check for the error first. If the error is non-nil, there are no guarantees wrt what the result would be.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok makes sense, I'm fine with this as-is.

@discordianfish
Copy link
Member

Yeah needs rebasing but also LGTM in general.

@kolyshkin
Copy link
Author

I think this will have a couple minor conflicts with #296, so this might need one more update after that one is merged.

Alas, this PR also fixes the same issue as #296 so it did not have to be merged first.

Anyway, let me rework this on top of it.

@kolyshkin kolyshkin marked this pull request as draft June 9, 2020 15:57
@SuperQ
Copy link
Member

SuperQ commented Aug 24, 2020

Any interest in continuing this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CPUInfo for ARM and Linux kernels newer than 3.8
5 participants