Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential infinite loop bug in ixgbe driver #960

Open
Hoblovski opened this issue May 12, 2023 · 5 comments
Open

Potential infinite loop bug in ixgbe driver #960

Hoblovski opened this issue May 12, 2023 · 5 comments
Assignees

Comments

@Hoblovski
Copy link

Hi Theseus community, I would like to report a potential bug.

The following lines of code contains a potential infinite loop bug.

// kernel/ixgbe/src/lib.rs: IxgbeNic::rx_init
            while rxq.rxdctl.read() & RX_Q_ENABLE == 0 {}

The bug is triggered when the device misbehaves, and generates faulty register values that never has the receive enable bit.
The driver will be stuck in this loop, resulting in denial of service.

The linux ixgbe driver avoids the problem via a counter bounding the loop, like

// drivers/net/ethernet/intel/ixgbe/ixgbe_main.c

#define IXGBE_MAX_RX_DESC_POLL 10
static void ixgbe_rx_desc_queue_enable(...) {
	int wait_loop = IXGBE_MAX_RX_DESC_POLL;
	// ...
	do {
		usleep_range(1000, 2000);
		rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(reg_idx));
	} while (--wait_loop && !(rxdctl & IXGBE_RXDCTL_ENABLE));

	if (!wait_loop) {
		e_err(drv, "RXDCTL.ENABLE on Rx queue %d not set within "
		      "the polling period\n", reg_idx);
	}
}

Can we confirm this bug? Also it can be fixed simply via a counter like the linux driver.
I've found a few similar cases but this one is representative.

Thank you very much.

@kevinaboos
Copy link
Member

Thanks for the report! I do agree that we need a timeout there.

I believe that @Ramla-I is working on a series of formally-verified device drivers for the NICs in Theseus, so this is likely already in progress or being addressed with a redesign.

In any case, thanks for bringing this up, and feel free to identify the other places in the code you've found such loops in this issue too.

@Ramla-I
Copy link
Member

Ramla-I commented Jun 12, 2023

Yes, this is a bug and should be addressed. We've just been lucky that we've never been stuck in the infinite loop. I'll make sure to address these changes in a future driver PR, but let's keep this issue open till it's resolved.

@Ramla-I Ramla-I self-assigned this Jun 12, 2023
@Hoblovski
Copy link
Author

I'm awfully sorry for the late reply, but thank you for the feedback.

Yes there're two more cases, and I've cross checked with linux code to filter out false positives, and the rest are:

  • ixgbe lib.rs:917
    • This one's similar, just TX instead of RX.
  • ixgbe lib.rs:704
    • This one's suspicious, and the linux ixgbe seems like bounded by timeout
    // linux/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
      	if (hw->eeprom.ops.read(hw, ++data_offset, &data_value))
      		goto setup_sfp_err;
      	while (data_value != 0xffff) {
      		IXGBE_WRITE_REG(hw, IXGBE_CORECTL, data_value);
      		IXGBE_WRITE_FLUSH(hw);
      		if (hw->eeprom.ops.read(hw, ++data_offset, &data_value))
      			goto setup_sfp_err;
      	}

Still I'm sorry for the late reply and thank you again for the feedback.

@Hoblovski
Copy link
Author

Hoblovski commented Jul 4, 2023

Hi @Ramla-I @kevinaboos , sorry for disturbing. I wonder if the latest bugs mentioned in the last reply can be confirmed. Thank you very much!

@Ramla-I
Copy link
Member

Ramla-I commented Jul 4, 2023

Right, these are both infinite loop bugs. Thanks for pointing them out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants