Category Archives: Linux

Troubleshooting Linux Storage 2023

As you may have seen, the first release of my book was published in late September, and I’ve received a great deal of positive feedback. To those who provided constructive feedback: I am most grateful, and I’ll make sure it is addressed in the 2023 release.

One of the most frequently asked questions was whether I could expand on the Fibre Channel and NVMe over FC side, as that seems to be an area where many Linux administrators who also deal with storage infrastructure management run into problems. The main reason people asked is that I’ve been doing this for over 20 years, so I must have some decent knowledge of it. They’ve followed my blog for a long time and would like to see how issues in an FC network correlate with, and propagate onto, the various layers of the operating system. Whether it relates to path management, IO issues, security, discovery or other problems that show up on Linux hosts, when the root cause sits somewhere in the FC network it is often difficult to pinpoint the exact location of the issue.

I would be very happy to expand on this, share the knowledge I have, and provide examples of problems and their resolutions.


Even though I’ve worked with the most complex and expensive equipment out there, I do not have a $100K home lab sitting in my study. Recent FC equipment is relatively expensive compared to Ethernet, and there is no free Wireshark equivalent that can capture FC traffic at line rate, nor anything like the error-injection options we have with “tc qdisc”. Host bus adapters and a 16G or newer switch that can talk NVMe over FC and has FPIN capabilities would need a fairly recent chipset and software. The same goes for an FC array.

I’m currently in touch with some good friends in the industry to see what the options are and whether they are able to accommodate my request. I know from experience that there are hurdles and roadblocks in the form of financial or legal restrictions, so I need to take things as they come. I’m grateful for any effort people make to help me out.

If there are past, current or future customers who have spare or superfluous equipment in this area and are able and willing to help, I would be extremely pleased.

This post may read like some begging exercise, but that is not the intention. I am genuinely committed to providing the best information I can to my readers, and I hope it enables them to prevent, or resolve, storage-related issues in a Linux environment as much as possible. Having the proper tools to do that is obviously a prerequisite.

If you are able to help and want to get in touch to see what we can do, just email me or make an appointment via the contact page over here.

Kind regards,


Using systemd-resolved to optimise DNS resolution

When you work from home and are required to use the corporate network, you’re often shoved into a dilemma where the VPN configuration pushed to your PC results in one of two modes: full-tunnel or split-tunnel.

Digging tunnels

A full-tunnel configuration is by far the most dreadful, especially when your VPN access point is on the other side of the planet. Basically all traffic to and from your system is pushed through that tunnel. This is even the case when a web page is hosted next door to where you are sitting: your requests to that webserver will first traverse the VPN to the other side of the planet, where your company’s proxies retrieve the page via the public web, only to send it back to you via that VPN again. Obviously the round trip and other delays result in abominable performance and a user experience that is excruciatingly painful.

A split-tunnel however is far more friendly. As I explained in one of my previous articles (here), only traffic destined for systems inside your corporate network will be routed over the VPN; requests to other systems just traverse the public interweb.

Domain Name resolution

There is, however, one exception: DNS, i.e. the name-to-IP-address translation. Traditionally, Linux uses a system-wide resolver that looks in “/etc/resolv.conf” to see what your DNS servers are and which domains to search, plus a few other options. That basically means that as soon as you have any VPN tunnel active, you would always need to use your corporate DNS servers for every request, as your system does not really know which server is located where. There may even be a situation where your corporate DNS servers point to a different host for the same domain. You often see this where employees get additional functionality compared to external users, or where credential verification is bypassed because you already have an authorised session on the internal systems.
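A system-wide resolver configuration of that kind typically looks something like this (the addresses and search domain here are placeholders, not taken from any real setup):

```
# /etc/resolv.conf -- one resolver set for the whole system
nameserver 10.0.0.53
nameserver 10.0.0.54
search example.corp
```

Note that there is no way in this file to say “use these servers only for these domains”; every query goes to the listed servers in order.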

The drawback, however, is that sites outside your corporate network are also resolved via your company’s DNS servers. This is not only a performance limitation from a resolver standpoint (remember that these DNS requests also have to traverse the same VPN tunnel), but the system you end up at may also not be the most appropriate one.

As an example.

If you have an active VPN to your corporate network, even DNS queries for a website in your own country will first go to “Corp DNS” which, if it does not already have the address cached, will forward the request to whatever “Corp DNS” has configured as its upstream DNS server (in this case Google). As you can see, you could have asked Google’s DNS servers yourself, but as your VPN session has set your resolver to use the Corp DNS, that does not happen. An additional point to be aware of is that no matter which website you visit, your company will have a record of it, as most corporate regulations stipulate that actions done on their systems will be logged for whatever purpose they deem necessary. This may sometimes conflict with privacy policies in different countries, but that is most often shuffled under the carpet and hidden in legal obscurity.

The above also means that for requests to sites that span geographies, you may not always get the most optimal system. Many DNS systems are able to determine where a request is coming from and subsequently provide the IP address of a system closest to the requestor. As your request is fulfilled by your company’s DNS server on the other side of the planet, the web server you are directed to may also be there. No need to panic, as many of these environments have built-in smarts to redirect you to a more local system, but it nevertheless means this situation is far from optimal. What you’re basically after is the ability, in addition to that split-tunnel configuration, to direct DNS queries to the DNS servers that actually host the domains behind that VPN, and nothing else.

In the above case your Linux system has two interfaces: one physical (WiFi or Ethernet) and one virtual (the VPN, most often called tunX where X is the VPN interface number).

Meet systemd-resolved

There are some Linux (or Unix) purists who shudder at the sight of systemd-based services, but I think most of them are actually pretty OK. Resolved is one of them.

What resolved allows you to do is assign specific DNS configurations to different interfaces in addition to generic global options.

As an example, here is the (partially redacted) output of “resolvectl status”:

Global
LLMNR setting: yes
MulticastDNS setting: yes
DNSOverTLS setting: no
DNSSEC setting: allow-downgrade
DNSSEC supported: no
Fallback DNS Servers:

Link 22 (tun0)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
DefaultRoute setting: yes
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
Current DNS Server:
DNS Servers:
DNS Domain:

Link 3 (wlp0s20f0u13)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
DefaultRoute setting: yes
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: yes
DNSSEC supported: yes
Current DNS Server:
DNS Servers:
DNS Domain: ~.

As you can see, it has three sections. The Global section caters for many default settings, which can be superseded by per-interface settings. I think the overview speaks for itself. All requests to domain “” and “” will be sent to one of the two DNS servers with a 10.15.230.[6-7] address. All my home-internal requests, as defined by the “” domain, are sent to that address. The “~.” means all other requests.

That will result in queries being returned like:

[1729][erwin@monster:~]$ resolvectl query
10.xx.16.9 -- link: tun0
10.xx.16.8 -- link: tun0
172.xx.24.164 -- link: tun0
172.xx.24.162 -- link: tun0
10.xx.100.4 -- link: tun0
10.xx.148.66 -- link: tun0
10.xx.7.221 -- link: tun0
10.xx.7.34 -- link: tun0
10.xx.7.33 -- link: tun0
10.xx.100.5 -- link: tun0

-- Information acquired via protocol DNS in 243.1ms.
-- Data is authenticated: no

If I were to use an external DNS server for that domain, it would return different addresses.

[1733][erwin@monster:~]$ dig @ +short

(These are not the real domains I queried, but I think you get the drift.)

Queries for non-corporate websites are resolved via the WiFi interface (wlp0s20f0u13):

[1733][erwin@monster:~]$ resolvectl query
2404:6800:4006:809::200e -- link: wlp0s20f0u13
-- link: wlp0s20f0u13

-- Information acquired via protocol DNS in 121.0ms.
-- Data is authenticated: no

As my home router has a somewhat more sophisticated setup, this also allows me to have all external DNS requests, not destined to or, use a DNS-over-HTTPS or DNS-over-TLS configuration to bypass any ISP mangling.


Systemd-resolved is a systemd service (duhh) which needs to be enabled first with “systemctl enable systemd-resolved”. The main configuration file is /etc/systemd/resolved.conf; individual configuration snippets can also be stored in the resolved.conf.d subdirectory.

-rw-r--r-- 1 root root 784 Oct 20 14:32 resolved.conf
drwxr-xr-x 2 root root 4096 Oct 20 14:24 resolved.conf.d/
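A drop-in file in that subdirectory could, for instance, look like the following (hypothetical values; the DNS server address is a documentation placeholder):

```
# /etc/systemd/resolved.conf.d/10-custom.conf
[Resolve]
DNS=192.0.2.53
FallbackDNS=9.9.9.9
DNSOverTLS=opportunistic
```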

The settings can also be applied interactively via the “resolvectl” command, which is what I have done. If your distro has NetworkManager installed, NM can also automatically configure resolved via D-Bus calls.

There is more involved than I can easily simplify here, as it would pretty quickly become a re-wording of the man page, which I try to avoid. At least I hope it has given you some idea of what you can do with “systemd-resolved”.

Kind regards,


Enabling Verbose Logging on Linux with Emulex Host Bus Adapters

Where did my disks go?

Every now and then you may run into an issue which cannot be explained properly by just looking at the standard events that show up in “/var/log/messages“.

Issues such as

Oct 7 18:24:20 centos8 kernel: lpfc 0000:81:00.0: 0:1305 Link Down Event xc received Data: xc x20 x800110 x0 x0
Oct 7 18:24:24 centos8 kernel: rport-11:0-4: blocked FC remote port time out: removing target and saving binding
Oct 7 18:24:24 centos8 kernel: lpfc 0000:81:00.0: 0:(0):0203 Devloss timeout on WWPN 50:06:0e:80:07:c3:70:00 NPort x01ee40 Data: x0 x8 x2

are fairly common, and the above simply shows a Link Down event. These are the easiest to troubleshoot when the remote switch logs tell you

Continue reading
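The full post covers the details, but as a quick sketch: the lpfc driver exposes a log-verbosity bitmask that can be set persistently at driver load or adjusted at runtime through sysfs. The mask value and host number below are placeholders; check the lpfc driver documentation for the event classes you actually want logged.

```
# /etc/modprobe.d/lpfc.conf -- persistent setting, applied at driver load
options lpfc lpfc_log_verbose=0x10

# or at runtime, per HBA instance, via sysfs:
#   echo 0x10 > /sys/class/scsi_host/host11/lpfc_log_verbose
```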

Reducing MFA/2FA requests on cloud apps


Third-party authentication and authorisation providers like Okta, Azure, GCS or AWS often have a trusted connection to their tenants. This sometimes allows MFA/2FA authentication requests to be bypassed, as the authentication has already occurred from inside the tenant’s network.
When employees work from remote locations they can set up a VPN to their company’s network in one of two modes.

  • Full Tunnel – causes ALL traffic to traverse the VPN to the company’s network, from where it is propagated to internal servers or, via firewalls and proxies, to the internet.
  • Split Tunnel – only traffic destined for the subnet routes pushed by the VPN server will traverse the VPN tunnel.

The full-tunnel setup may be helpful if you only work with systems inside your corporate network. But given that a vast number of applications are now published in some obscure place called “The Cloud”, you basically have no clue where anything resides.

I’ve created a script, pushed to GitHub (over here), that creates specific routes based on your settings, which may reduce the number of MFA/2FA requests you need to validate.
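As a hedged sketch of the idea (not the actual script’s contents; the CIDRs below are documentation ranges standing in for your identity provider’s published IP ranges, and tun0 is a placeholder interface name):

```shell
#!/bin/sh
# Route only the identity provider's ranges over the VPN, so its servers
# see your corporate egress address and may skip the MFA/2FA prompt.
# DRY_RUN=1 prints the commands instead of applying them.
VPN_IF=tun0
DRY_RUN=1
for cidr in 192.0.2.0/24 198.51.100.0/24; do
    cmd="ip route add $cidr dev $VPN_IF"
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $cmd"
    else
        $cmd    # needs root
    fi
done
```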

Have a look at the “README” for more info.


Getting rid of whitespace

No, not storage related, but more about coding scripts etc. and ensuring your Git repositories do not show up with huge diff sections you need to correct. Just a little tip and a “note to self”.

If you’ve ever been keen enough not to use an IDE for whatever language you use and kept to a real editor (VIM obviously.. :-)), you may have encountered the phenomenon that whitespace at the end of lines is a nasty thing to look at once you start putting stuff into version control systems like Subversion or Git. A little change from some copy-or-paste action may leave you with a “git diff” of a couple of hundred lines you need to correct.

To fix that, simply let VIM remove all trailing whitespace (tabs, spaces, etc.) before the actual write to disk.

To do that simply add

autocmd BufWritePre *.sh :%s/\s\+$//e

to your ~/.vimrc, and with every :w the substitute function driven by the regex after the colon will remove it all in all shell scripts (*.sh). Obviously you can add any extension you need here. Very handy.
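For files that already contain trailing whitespace, the same regex works outside vim too, for instance with sed; “git diff --check” will also flag trailing whitespace in staged changes. A small demo (using a throwaway file in /tmp):

```shell
# Write a demo script with trailing spaces and a tab, then clean it up
# with the shell equivalent of the vim substitute above.
printf 'echo "hello"  \t\n' > /tmp/demo.sh
sed -i 's/[[:space:]]*$//' /tmp/demo.sh   # GNU sed; same regex idea as in vim
cleaned=$(cat /tmp/demo.sh)
```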



Hitachi Va….. what???

VANTARA. Hitachi Vantara. Yes, took me a while to get used to it as well after almost 15 years at HDS but I must say reading all the internal and external communications and overall business and technical transformation gets me more excited by the day.

HDS, Pentaho and Hitachi Insight Group are combined into a new company focussing on IoT, Analytics and helping customers obtain the maximum use and benefit out of their operational technologies via bullet proof IT from HDS.

Two months ago I had solar panels installed. Living in the Australian Sunshine Coast hinterland, I thought I’d let the name and location do some work for me and, as a positive side effect, bring my power bills down as well. In addition, I’ll prevent a few tonnes of CO2 emissions by not having the power companies burn black rock. The challenge however is to make the best use of the solar installation. As I did not opt for a battery installation (simply too much money, and the ROI currently makes it not worth it), the panels deliver between 4 and 5 kWh on average, depending on a few factors like cloud cover, the angle of the sun, shade from trees, etc. Continue reading

Performance misconceptions on storage networks

The piece of spinning Fe3O4 (i.e. rust) is by far the slowest piece of equipment in the IO stack. Heck, they didn’t invent SSD and flash for nothing, right? To overcome the terrible latency involved when a host system requests a block of data, there are numerous layers of software and hardware that try to reduce the impact of physical disk related drag.

One of the most important is caching. Whether that is CPU L2/L3 cache, DRAM cache, some hardware buffering device in the host system, or even the huge caches in storage subsystems: all of these can, and will, be used by numerous layers of the IO stack, as each cache hit avoids fetching data from a disk. (As an intro to this post you might read one I’ve written over here, which explains what happens where when an IO request reaches a disk.)

Continue reading

Open Source Storage (part 2)

Six years ago I wrote this article: Open Source Storage, in which I described that storage would become “Software Defined”. Basically, I predicted SDS before the acronym was even invented. What I did not see coming is that Oracle would buy Sun and, by doing that, basically kill off the entire “Open Source” part of that article. But hey, at least you can call yourself an America’s Cup sponsor and Larry Ellison’s yacht maintainer. 🙂

Fast-forward six years to 2015 and the software-defined storage landscape has expanded massively. Not only is there a huge number of different solutions available now, but the majority of them have evolved into mature storage platforms with almost infinite scalability in capacity and performance.

Continue reading