Snapshots - LUN assignment > 255 "bug" - no access from ESXi

Hi again....

I was playing with the very nice StorageVolume snapshot features... really a great feature!
I have configured some hourly/daily/weekly... snapshots just to see how it works, what kind of changes there are in such time periods, etc.
Yesterday, after quite some time not doing this, I tried to mount an older volume snapshot to recover some data I managed to destroy.

I have two ESX hosts, connected to QS via FC.
I assigned the snapshot to the hosts and at first all seemed "like usual"... but my ESX hosts (v5.5) did not see any "new" drives...
After a little poking around I noticed the snapshot volume was assigned the FC SCSI LUN 527 - a little "strange" LUN ID I would say...

OK, after some more peeking and digging I finally found some answers...
The VMware documentation says the maximum supported LUN ID is 255, and I also found some other references and tests confirming that: http://sflanders.net/2011/10/02/esxi-lun-id-maximum/

So.... is there a way to change the LUN assignment for the volumes? I know your documentation says "LUNs are assigned automatically" but.... that assignment is not really "useful" it seems.
There is no problem with the iSCSI access as there the LUN is always 0... but for FC... it's a real problem...
Is it really possible that no one noticed this before??

QS version 4.2.0.375

Any suggestion?

Best regards,
M.Culibrk

March 12, 2017 | Registered Commenter M.Culibrk

Hello,

In this case, you should be able to use the LUN ID with the latest version of vSphere, which supports LUN IDs up to 1023. If you have a specific need beyond this, more information about the issue you are having would help us understand it better.

Also, would you be able to clarify if this is for the free community edition? Otherwise, are you looking to consult or become a partner and have direct support access? If so, you can contact Sales Engineering at sdr@osnexus.com if you have a particular customer and pending sale up for discussion.

Thanks

March 13, 2017 | Registered Commenter Aaron Knodel

Hi M,
Just to add to Aaron's feedback, I think that we could be doing a better job in picking FC LUN numbers. With iSCSI we are always using LUN 0 because each volume and snapshot has a unique IQN with the device on LUN 0. With FC the model is different: we have to account for ALUA environments where the same volume is presented from multiple appliances, and we want those to be exposed with the same LUN number. I think the issue is that we're picking the LUN number too soon. Essentially, we should have "lazy" allocation of LUN numbers, done once the storage is assigned. That would prevent these scenarios where you're seeing these high LUN numbers. Could you share the number of volumes and snapshots you have in your configuration?
Best,
-Steve
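
To illustrate the difference described above, here is a minimal Python sketch. It is not QuantaStor's actual code: the class names `EagerAllocator` and `LazyAllocator`, the `simulate` helper, and the volume and snapshot counts are all assumptions made up for the example, and it assumes LUN numbers are never reused.

```python
import itertools

ESXI_55_MAX_LUN_ID = 255  # limit cited earlier in this thread for ESXi 5.5


class EagerAllocator:
    """Hypothetical: reserves a LUN number as soon as an object is created."""

    def __init__(self):
        self._counter = itertools.count(0)
        self.luns = {}

    def create(self, name):
        self.luns[name] = next(self._counter)

    def assign(self, name):
        return self.luns[name]


class LazyAllocator:
    """Hypothetical: reserves a LUN number only on assignment to a host."""

    def __init__(self):
        self._counter = itertools.count(0)
        self.luns = {}

    def create(self, name):
        pass  # nothing is reserved at creation time

    def assign(self, name):
        if name not in self.luns:
            self.luns[name] = next(self._counter)
        return self.luns[name]


def simulate(allocator, volumes=10, snapshots=600):
    # Volumes are created and assigned to hosts right away.
    for i in range(volumes):
        allocator.create(f"vol-{i}")
        allocator.assign(f"vol-{i}")
    # Scheduled snapshots are created but normally never assigned.
    for i in range(snapshots):
        allocator.create(f"snap-{i}")
    # Only the newest snapshot is assigned, for a recovery.
    return allocator.assign(f"snap-{snapshots - 1}")


eager_lun = simulate(EagerAllocator())
lazy_lun = simulate(LazyAllocator())
print(eager_lun, eager_lun > ESXI_55_MAX_LUN_ID)  # 609 True
print(lazy_lun, lazy_lun > ESXI_55_MAX_LUN_ID)    # 10 False
```

With eager allocation every scheduled snapshot consumes a number, so the one snapshot that finally gets assigned lands far above 255. With lazy allocation it simply takes the next number after the volumes that are already assigned.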

April 3, 2017 | Registered Commenter Steve

Hi there!

Sorry for this really long response time....

The issue is exactly as you described it! You allocate LUN numbers for volumes immediately on creation instead of allocating them when assigned to some host.

I have about 10 active volumes and a few snapshot schedules: the first runs every 1-2 hours and keeps 3-4 snapshots, and the other runs daily, producing a kind of "daily/weekly/monthly" backup snapshot.

LUN numbers are going really high... I think it's at approx. 1750 right now...

example:

Compression Ratio: 1.54x
Created By: admin
Created Time Stamp: Sun Oct 29 23:01:01 GMT+100 2017
Description: Auto-generated by snapshot schedule 'Weekly'.
IQN: iqn.2009-10.com.osnexus:ecf9ea6f-53c6794bd7079c10:LJ-01-SAS1.GMT20171029.220101
Internal ID: 53c6794b-d707-9c10-6422-0e9e2566afbc
Internal Location: /dev/zvol/qs-ecf9ea6f-975f-7738-1f12-d74accbc2de4/53c6794b-d707-9c10-6422-0e9e2566afbc
Is Snapshot?: true
Is Thin Provisioned?: true

LUN: 1684

Modified By: admin
Modified Time Stamp: Sun Oct 29 23:01:01 GMT+100 2017
Name: LJ-01-SAS1_GMT20171029_220101
Owner(s): admin

If you have any idea on how to solve this... it would be really, really appreciated!

Best regards,
M.Culibrk

November 17, 2017 | Registered Commenter M.Culibrk

Hi M,
Technically, you could edit the StorageVolume rows in our internal SQLite database under /var/opt/osnexus/quantastor/osn.db to re-order the LUN numbers. WARNING: This falls under the "at-your-own-risk" category, as you can easily mess up the database. You'd want to be sure that the service is stopped ('service quantastor stop') and your clients are all disconnected, and you'd need to reboot the appliance after the change so that the target driver is reloaded. I'll check with engineering to see about getting this into the roadmap for v4.5 now that v4.4 has shipped, so that you don't need to do hacky stuff like what I describe above.
Best,
-Steve
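
For anyone attempting the manual route, the renumbering step might look roughly like the Python sketch below. It runs against a throwaway in-memory database, not the real osn.db. Only the table name `StorageVolume` comes from the post above; the columns `id`, `name`, and `lun`, the sample rows, the starting LUN of 1, and the helper `renumber_luns` are assumptions for illustration. Inspect the actual schema first and work on a backup copy.

```python
import sqlite3

# Demo database in memory. The real osn.db schema is NOT known here;
# the columns below (id, name, lun) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StorageVolume (id TEXT PRIMARY KEY, name TEXT, lun INTEGER)"
)
conn.executemany(
    "INSERT INTO StorageVolume VALUES (?, ?, ?)",
    [("a", "vol-1", 3), ("b", "vol-2", 527), ("c", "snap-1", 1684)],
)


def renumber_luns(conn, start=1):
    """Compact LUN numbers into start, start+1, ... keeping their order."""
    rows = conn.execute("SELECT id FROM StorageVolume ORDER BY lun").fetchall()
    with conn:  # one transaction: commit on success, roll back on error
        for new_lun, (row_id,) in enumerate(rows, start=start):
            conn.execute(
                "UPDATE StorageVolume SET lun = ? WHERE id = ?",
                (new_lun, row_id),
            )


renumber_luns(conn)
result = conn.execute(
    "SELECT name, lun FROM StorageVolume ORDER BY lun"
).fetchall()
print(result)  # [('vol-1', 1), ('vol-2', 2), ('snap-1', 3)]
```

Note that compacting everything also changes the LUN numbers of volumes that hosts already use, so in practice it may be safer to change only the rows for the snapshots you actually need to reach.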

November 17, 2017 | Registered Commenter Steve

Thanks for the quick response!

Yeah... I already kinda did what you suggested... some time ago when I really needed to access a snapshot for recovery... and it worked... and saved my day (or better a few nights).

Anyway... the thing that I do not get/understand is... how is it that no one else noticed/hit this problem??
It's kind of scary when you "count" on the snapshots being there, and all seems right, till the moment you actually need/try to use the volume/LUN... no errors or notifications of any kind on any side... just "the LUN is not there to be seen". Panic!
But, after a few "tranquilizer" shots and a few deep breaths things slowly unhide from the shadows... :)

So, thanks for all the help & support, and a great piece of SW!

Regards,
M.Culibrk

November 18, 2017 | Registered Commenter M.Culibrk

M, thanks for the feedback on this. We're going to address this in v4.4.1 as a hot fix. The plan is to make the LUN numbers dynamically assigned at the point in time a Storage Volume is assigned to a Host/Host Group. Unassignment and/or deletion of a Storage Volume will return the LUN number to the available list so that it may be reused. Further, we're looking at the option to assign static LUN numbers so that they can be made sticky between assign/unassign operations.
Best,
-Steve
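
The plan above (allocate on assignment, return the number on unassign or delete, optionally pin a static number) can be sketched as a small free-list allocator. This is only an illustration under assumptions, not the v4.4.1 implementation: the class `LunPool`, its method names, and the default ceiling of 255 are all hypothetical.

```python
import heapq


class LunPool:
    """Hypothetical LUN pool: lowest free number first, reusable on release."""

    def __init__(self, max_lun=255):
        self._free = list(range(max_lun + 1))
        heapq.heapify(self._free)
        self._assigned = {}  # volume name -> LUN currently in use
        self._sticky = {}    # volume name -> LUN reserved across unassign

    def pin(self, volume, lun):
        """Reserve a static LUN number for a volume."""
        if lun not in self._free:
            raise ValueError(f"LUN {lun} is not available")
        self._free.remove(lun)
        heapq.heapify(self._free)
        self._sticky[volume] = lun

    def assign(self, volume):
        """Allocate a LUN only now, when the volume is assigned to a host."""
        if volume in self._assigned:
            return self._assigned[volume]
        if volume in self._sticky:
            lun = self._sticky[volume]
        elif self._free:
            lun = heapq.heappop(self._free)
        else:
            raise RuntimeError("no free LUN numbers")
        self._assigned[volume] = lun
        return lun

    def unassign(self, volume):
        """Return the LUN to the pool unless it is pinned to this volume."""
        lun = self._assigned.pop(volume)
        if volume not in self._sticky:
            heapq.heappush(self._free, lun)

    def delete(self, volume):
        """Release whatever LUN the volume holds, pinned or not."""
        lun = self._assigned.pop(volume, None)
        pinned = self._sticky.pop(volume, None)
        released = lun if lun is not None else pinned
        if released is not None:
            heapq.heappush(self._free, released)


pool = LunPool()
a = pool.assign("vol-a")    # takes LUN 0
b = pool.assign("snap-b")   # takes LUN 1
pool.unassign("vol-a")      # LUN 0 goes back to the pool
c = pool.assign("snap-c")   # reuses LUN 0

pool.pin("vol-d", 42)
d1 = pool.assign("vol-d")   # 42
pool.unassign("vol-d")
d2 = pool.assign("vol-d")   # still 42, because it is pinned
print(a, b, c, d1, d2)      # 0 1 0 42 42
```

Handing out the lowest free number keeps assigned LUNs small no matter how many unassigned snapshots exist, which is the property that matters for hosts with a low LUN ID ceiling.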

November 20, 2017 | Registered Commenter Steve

M, we've fixed the issue in v4.4.1 and we're just finishing up the testing. Look for the new QuantaStor v4.4.1 release on Monday which will address this.
Best,
-Steve

December 15, 2017 | Registered Commenter Steve