issues on porting FBOSS to an OpenNSL supported platform #60

Open
Lewis-Kang opened this issue Nov 15, 2017 · 9 comments
Comments

@Lewis-Kang

Lewis-Kang commented Nov 15, 2017

Hi,

I'm trying to run the FBOSS agent on an AS6812-32X switch running ONL (Open Network Linux).

First, I linked fboss_agent against the opennsl.so that works for the AS6812-32X and built the wedge_agent executable.

Second, I installed all the libraries wedge_agent needs (folly, glog, gflags, ...) on the switch.

Third, I configured the switch to have more than 64 ports during opennsl driver initialization, so that I can use /etc/fboss/sample1.json (copied from fboss/agent/configs/sample1.json) directly.

Then I ran wedge_agent -mgmt_if=ma1 -can_warm_boot=false -mode=wedge -config=/etc/fboss/sample1.json, and the console printed the following two kinds of errors:

E0101 01:05:35.657263 2394 WedgeProductInfo.cpp:131] json parse error on line 0: expected json value
E0101 01:05:35.658212 2394 WedgeProductInfo.cpp:66] json parse error on line 0: expected json value
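
The message "expected json value" on line 0 is consistent with a parser being handed empty or non-JSON input. Below is a minimal sketch for checking such a file before starting the agent, assuming the product info is a plain JSON document on disk. The helper name check_json_file is hypothetical, and the path the agent actually reads is not given in this thread, so it is passed in as an argument.

```python
import json
import os


def check_json_file(path):
    """Return (ok, reason) describing whether `path` holds parseable JSON."""
    if not os.path.exists(path):
        return False, "missing"
    with open(path, "r") as f:
        text = f.read()
    if not text.strip():
        # An empty file is the simplest way to get "expected json value".
        return False, "empty"
    try:
        json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        return False, "invalid: %s" % exc
    return True, "ok"
```

Calling check_json_file on the file the agent is configured to read should tell you whether the file is missing, empty, or malformed.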

E0101 02:11:02.600317 4870 WedgePort.cpp:104] Error retrieving info for transceiver 0 Exception: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E0101 02:11:02.618465 4870 WedgePort.cpp:104] Error retrieving info for transceiver 0 Exception: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
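
errno 111 (Connection refused) means nothing was accepting connections on the address the agent tried to reach. Below is a minimal sketch for probing a TCP port from the switch. probe_tcp is a hypothetical helper, and the host and port of whatever service answers the transceiver queries are assumptions you would need to fill in for your setup.

```python
import errno
import socket


def probe_tcp(host, port, timeout=1.0):
    """Try a TCP connect and return 'open', 'refused', or 'error: <detail>'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except socket.error as exc:
        # ECONNREFUSED is the errno 111 seen in the log above on Linux.
        if exc.errno == errno.ECONNREFUSED:
            return "refused"
        return "error: %s" % exc
    finally:
        sock.close()
```

A "refused" result for the port in question would suggest that the service the agent depends on is not running or is listening elsewhere.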


Could someone give me some guidance on fixing the two errors above?

By the way, is there a porting guide we can follow for porting FBOSS to a new platform?

Thanks in advance.

-Lewis

@Lewis-Kang
Author

Lewis-Kang commented Nov 18, 2017

Some more information (the two errors mentioned above still occur):

Using fboss_route.py list_ports, I get the correct port numbers, link status, and admin-enabled status, but present is wrong (I guess this requires extra porting for the platform-specific i2c slave address and offset).

e.g.,
root@localhost:~# fboss_route.py list_ports
Port 1: [enabled=True, up=True, present=None]
Port 2: [enabled=True, up=False, present=None]
Port 3: [enabled=True, up=False, present=None]
...

root@localhost:~# fboss_route.py disable_port 1

root@localhost:~# fboss_route.py list_ports
Port 1: [enabled=False, up=False, present=None]


Adding route entries also fails.

e.g.,
root@localhost:~# fboss_route.py list_routes
Route 0.0.0.0/0 -->
Route ::/0 -->


root@localhost:~# fboss_route.py add 172.31.0.0/24 172.16.1.1
Traceback (most recent call last):
  File "/usr/bin/fboss_route.py", line 254, in <module>
    args.func(args)
  File "/usr/bin/fboss_route.py", line 48, in add_route
    args.client, [UnicastRoute(dest=prefix, nextHopAddrs=nexthops)])
  File "/usr/local/lib/python2.7/dist-packages/neteng/fboss/ctrl/FbossCtrl.py", line 11295, in addUnicastRoutes
    self.recv_addUnicastRoutes()
  File "/usr/local/lib/python2.7/dist-packages/neteng/fboss/ctrl/FbossCtrl.py", line 11317, in recv_addUnicastRoutes
    raise result.error
neteng.fboss.ttypes.FbossBaseError: FbossBaseError(
    message='switch is still initializing, FIB not synced yet',
    message='switch is still initializing, FIB not synced yet')
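
The error says the agent had not finished initializing when the route was added. Below is a minimal sketch of retrying a call until that state clears. retry_until_synced and the add_route callable are hypothetical, and matching on the message text is an assumption based only on the traceback above. If the agent never completes initialization on an unsupported platform, the retries will simply run out and the last error is raised.

```python
import time


def retry_until_synced(add_route, attempts=5, delay=2.0, sleep=time.sleep):
    """Call add_route(); retry while it fails with a 'FIB not synced' error."""
    for attempt in range(1, attempts + 1):
        try:
            return add_route()
        except Exception as exc:
            # Only retry the specific initialization error; re-raise others,
            # and re-raise once the attempts are exhausted.
            if "FIB not synced" not in str(exc) or attempt == attempts:
                raise
            sleep(delay)
```

The sleep parameter is injectable so the helper can be exercised without real delays.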

@capveg

capveg commented Nov 22, 2017

Hi Lewis,

There's a bunch of work that has to be done to port to another platform. You need a new config.bcm (which you probably have), but you also need to create fboss/agent/platform/PLATFORM.cpp code and link it into the right places to get the phy programming and other sundry things correct. Let me write up a proper document for how to do this.

@Lewis-Kang
Author

Lewis-Kang commented Nov 22, 2017

Hi Rob,

Yes, I have the needed config.bcm. I use our own build of opennsl.so that works for the AS6812-32X (running ONL), so the hardware port LED link status and color are already correct.

I also installed the CLI tool and the Python files it needs on the switch. The cli.py port status command runs and shows the correct Admin/State/Speed.

e.g.,
root@localhost:~# cli.py port status
Port  Admin State  Link State  Transceiver  Speed
-----------------------------------------------------------
1     Enabled      Up          Unknown      10G
...
13    Enabled      Up          Unknown      40G

I think all I need now is to figure out how to add/modify FBOSS code to support AS6812-32X.

Looking forward to your guidance.

Thanks in advance.

-Lewis

@capveg

capveg commented Dec 16, 2017

So writing a formal doc for this has taken me longer than I wanted - apologies. Let me give you a slapdash answer that might unblock you.

For each new platform, you need the config.bcm (that gets passed as a command line option) and you also need to implement platform drivers in the agent (see the code in ./fboss/agent/platform/) as well as a platform driver for the qsfp_service (see ./fboss/qsfp_service/platform/).

There are a bunch of example platforms in those directories so hopefully the code will provide enough context.
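
Below is a minimal sketch for listing what is in those two directories of an FBOSS checkout, as a starting point for finding an example platform to copy from. list_platform_entries is a hypothetical helper. Only the two directory paths come from the comment above, and the layout of a given checkout may differ.

```python
import os

# The two directories named in the comment above, relative to the repo root.
PLATFORM_DIRS = (
    "fboss/agent/platform",
    "fboss/qsfp_service/platform",
)


def list_platform_entries(repo_root):
    """Map each platform directory to its sorted entries (empty if absent)."""
    result = {}
    for rel in PLATFORM_DIRS:
        path = os.path.join(repo_root, rel)
        result[rel] = sorted(os.listdir(path)) if os.path.isdir(path) else []
    return result
```

Running it against the root of a checkout shows which example platforms exist on the agent side and on the qsfp_service side, so the two can be compared.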

Please let me know if you have more questions and I'll keep trying to get time to work on the platform.

@bluecmd

bluecmd commented Jun 13, 2018

Hijacking this issue a bit. In general, how open is Facebook to merging additional platforms? I'm interested in porting AS5712-54X, and I will probably do it anyhow - but I'm happy to upstream it if it's something FBOSS generally wants.

@capveg

capveg commented Jun 21, 2018

@bluecmd Sorry - missed this one. At one level, we'd love to accept the code. At another level, if it's a platform we're not actually using, we don't really have any decent way to test it. There's constant development going on with fboss and the likelihood that code for an untested platform would break is fairly high. We'd want to come up with some sort of external hardware-tested CI system before doing something like that and ... that just seems like a lot of work from where we are now.

Does that help?

@bluecmd

bluecmd commented Jun 21, 2018

Sure. What I'm hearing is that I'm probably better off just maintaining a fork of fboss right now for my HW support, until such a time comes where you're ready to accept and maintain new platforms.

@capveg

capveg commented Jun 21, 2018 via email

@bluecmd

bluecmd commented Jun 21, 2018

The difference to me is the state they are in - my fork would be pretty experimental and not ready for merging until such a time comes when you folks know what kind of standard you want new platforms to adhere to. Nothing wrong with that, and I'm fine with having a long lived PR open if nothing else as an index of other platforms being worked on.

Nice idea with ONLP. I'd been thinking of adding Wedge support for it, so maybe an ONLP port could replace some Wedge specific code. Very interesting thought.

My larger picture is that I'm a fortunate hobbyist who has access to a couple of Wedge switches and an AS5712-54X. I'm also a bit tired of there being no real go-to solution for whitebox hobbyists. I think FBOSS+ONL could be that combination and I'd like to try to help out if I can.

However, I already did that once with the old OpenSwitch 1.0 by HPE, which died, so I'm just doing my due diligence to check that Facebook appears committed to the open source version of FBOSS.

arajeev-ARISTA pushed a commit to arajeev-ARISTA/fboss that referenced this issue Sep 26, 2023
…ates

Meru800bia: update udev rules and add platform_init.sh