Category Archives: LST book

Articles posted here are based on feedback from system administrators, storage architects, infrastructure designers and support people who purchased the “Linux Storage Troubleshooting” book.
Their contributions and help are invaluable for everyone involved in operating a storage ecosystem linked to a Linux-based server infrastructure.

Troubleshooting Linux Storage 2023

As you may have seen, the first release of my book was published in late September, and I’ve received a great deal of positive feedback. I am most grateful to those who provided constructive feedback, and I’ll make sure their points are addressed in the 2023 release.

One of the most frequently asked questions was whether I could expand on the Fibre-Channel and NVMeoFC side, as that seems to be an area where many Linux administrators who also deal with storage infrastructure management run into problems. The main reason people asked is that I’ve been doing this for over 20 years, so I must have some decent knowledge of it. They’ve followed my blog for a long time and would like to see how issues in an FC network correlate with, and propagate onto, the various layers of the operating system. Whether the symptom is related to path management, IO issues, security, discovery or other problems that show up on Linux hosts, when it originates somewhere in the FC network it is often difficult to pinpoint the exact location of the issue.

I would be very happy to expand on this, share the knowledge that I have and provide examples of problems and their resolutions.

Hardware

Even though I’ve worked with the most complex and expensive equipment out there, I do not have a $100K home-lab sitting in my study. Recent FC equipment is relatively expensive compared to Ethernet, and there is no such thing as a freebie Wireshark that can capture FC traffic at line rate or inject errors, as we can on Ethernet with “tc qdisc” options. Host bus adapters and a 16G or newer switch that can talk NVMeoF and has FPIN capabilities would already need a fairly recent chipset and software. The same goes for an FC array.

I’m currently in touch with some good friends in the industry to see what the options are and whether they are able to accommodate my request. I know from experience that there are hurdles and roadblocks in the form of financial or legal restrictions, so I need to take things as they come. I’m grateful for any effort people make to help me out.

If there are past, current or future customers who have “spare/superfluous” equipment in this area and are able/willing to help I would be extremely pleased.

It looks like this post has turned into something of a begging exercise, but that is not the intention. I am really committed to providing the best information I can to my readers, and I hope it helps them prevent, or resolve, storage-related issues in a Linux environment as much as possible. Having the proper tools is obviously a prerequisite for that.

If you are able to help and want to get in touch to see what we can do, just email me or make an appointment via the contact page.

Kind regards,

Erwin

Enabling Verbose Logging on Linux with Emulex Host Bus Adapters

Where did my disks go?

Every now and then you may run into an issue that cannot be explained properly just by looking at the standard events that show up in “/var/log/messages”.

Issues such as

Oct 7 18:24:20 centos8 kernel: lpfc 0000:81:00.0: 0:1305 Link Down Event xc received Data: xc x20 x800110 x0 x0
Oct 7 18:24:24 centos8 kernel: rport-11:0-4: blocked FC remote port time out: removing target and saving binding
Oct 7 18:24:24 centos8 kernel: lpfc 0000:81:00.0: 0:(0):0203 Devloss timeout on WWPN 50:06:0e:80:07:c3:70:00 NPort x01ee40 Data: x0 x8 x2

are fairly common, and the above simply shows a Link Down event. These are the easiest to troubleshoot when the remote switch log tells you

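As a rough illustration of how the three kernel messages quoted above relate to each other, here is a minimal Python sketch that classifies them and pulls out the PCI address, remote port and WWPN. The regular expressions and the event names (link_down, rport_timeout, devloss) are my own assumptions, fitted only to these three sample lines; other lpfc driver versions may format their messages differently.

```python
import re

# The three sample lines quoted in the post.
SAMPLE = """\
Oct 7 18:24:20 centos8 kernel: lpfc 0000:81:00.0: 0:1305 Link Down Event xc received Data: xc x20 x800110 x0 x0
Oct 7 18:24:24 centos8 kernel: rport-11:0-4: blocked FC remote port time out: removing target and saving binding
Oct 7 18:24:24 centos8 kernel: lpfc 0000:81:00.0: 0:(0):0203 Devloss timeout on WWPN 50:06:0e:80:07:c3:70:00 NPort x01ee40 Data: x0 x8 x2
"""

# Assumption: these patterns are fitted to the sample lines only and are
# not taken from any driver documentation.
PATTERNS = (
    ("link_down", re.compile(r"lpfc (?P<pci>\S+): \S+ Link Down Event")),
    ("rport_timeout", re.compile(
        r"(?P<rport>rport-\d+:\d+-\d+): blocked FC remote port time out")),
    ("devloss", re.compile(
        r"lpfc (?P<pci>\S+): \S+ Devloss timeout on WWPN "
        r"(?P<wwpn>[0-9a-f:]{23}) NPort (?P<nport>x[0-9a-f]+)")),
)


def classify(line):
    """Return (event_name, details) for a recognised line, else None."""
    for name, pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            return name, match.groupdict()
    return None


# Keep only the lines that matched one of the patterns.
events = [event for event in map(classify, SAMPLE.splitlines()) if event]

for name, details in events:
    print(name, details)
```

Reading the events in order shows the sequence on the host side: the adapter at one PCI address reports the link going down, the remote port times out a few seconds later, and the driver then logs a devloss timeout for the target WWPN behind that port.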