
VBA EXCEL CONVERT STRING TO LONG CODE

I thought this SO answer was a good start since it generates a 4-digit hex code (CRC16). But 4 digits were too few: in my test with 400 strings, 20% got a duplicate somewhere else. The chance of generating a collision is too high.

    If Crc / 2 <> Int(Crc / 2) Then mask = &HA001
    Crc = Int(Crc / 2) And &H7FFF: Crc = Crc Xor mask

You can copy these 400 test strings from pastebin. Paste them into column A of a new Excel workbook and execute the code above.

Q: How do I get a string hash that is short enough (12 chars) and still long enough to produce only a small percentage of duplicates?

I have collected some different functions to generate a short hash of a string in VBA. I don't take credit for the code and all sources are referenced.

Function: =CRC16HASH(A1) with this code.
    4 digits long hash = 624 collisions in 6895 lines = 9 % collision rate.
Function: =CRC16NUMERIC(A1) with this code.
    5 digits long hash = 616 collisions in 6895 lines = 8.9 % collision rate.
Function: =CRC16TWICE(A1) with this code.
    ... characters to reduce collision rate even more
    8 digits long hash = 18 collisions in 6895 lines = 0.23 % collision rate.
Function: =SHA1TRUNC(A1) with this code.
    hash is a 40 characters long HEX string.

Here is my test workbook with all example functions and a big number of test strings.

Split your string into three shorter strings (if not divisible by three, the last one will be longer than the other two). Run your "short" algorithm on each, and concatenate the results. I could write the code but based on the quality of the question I think you can take it from here!

EDIT: It turns out that that advice is not enough. There is a serious flaw in your original CRC16 code, namely the line that says:

    j = Val("&H" + Mid(txt, nC, 2))

This only handles text that can be interpreted as hex values: lowercase and uppercase letters are the same, and anything after F in the alphabet is ignored (as far as I can tell). That anything good comes out at all is a miracle. If you replace the line with

    j = Asc(Mid(txt, nC, 1))

things work better: every ASCII code at least starts out life as its own value.

Combining this change with the proposal I made earlier, you get the following code:

    Function hash12(s As String)
    ' create a 12 character hash from string s
    Dim s1 As String, s2 As String, s3 As String
    ' ...
    s3 = Mid(s, 2 * l3 + 1) ' the rest of the string
    ' ...
    j = Asc(Mid(txt, nC))
    ' ...
    If crc / 2 <> Int(crc / 2) Then mask = &HA001
    crc = Int(crc / 2) And &H7FFF: crc = crc Xor mask
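
For anyone who wants something to paste and try, here is a minimal sketch of the split-into-three idea with the Asc fix applied. It is not the code from the answer, only an illustration: the names Crc16Hex and Hash12Sketch are made up, the start value &HFFFF is an assumption (the mask &HA001 is the one that appears in the thread), and Long variables are used instead of Integer to keep the arithmetic non-negative.

```vba
' Hypothetical helper: 16-bit CRC over the character codes of txt,
' using the mask &HA001 from the thread and an ASSUMED start value of &HFFFF.
' Always returns exactly 4 hex digits.
Function Crc16Hex(ByVal txt As String) As String
    Dim crc As Long, i As Long, k As Long
    crc = &HFFFF&
    For i = 1 To Len(txt)
        ' Asc gives the character code; keep only the low byte
        crc = crc Xor (Asc(Mid$(txt, i, 1)) And &HFF&)
        For k = 1 To 8
            If (crc And 1) <> 0 Then
                ' lowest bit set: shift right, then apply the mask
                crc = (crc \ 2) Xor &HA001&
            Else
                crc = crc \ 2
            End If
        Next k
    Next i
    ' left-pad with zeros so the result is always 4 characters long
    Crc16Hex = Right$("0000" & Hex$(crc), 4)
End Function

' Hypothetical wrapper: split s into three parts, hash each part,
' and concatenate the results into a 12-character string.
Function Hash12Sketch(ByVal s As String) As String
    Dim l3 As Long
    l3 = Len(s) \ 3
    ' the last part takes whatever is left, so it may be longer than the others;
    ' an empty part simply hashes to the start value, "FFFF"
    Hash12Sketch = Crc16Hex(Mid$(s, 1, l3)) & _
                   Crc16Hex(Mid$(s, l3 + 1, l3)) & _
                   Crc16Hex(Mid$(s, 2 * l3 + 1))
End Function
```

Used from a worksheet it would be called as =Hash12Sketch(A1). Note that strings shorter than three characters leave the first two parts empty, so short inputs are more likely to collide.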

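
To compare hash functions on a list of test strings, the duplicates have to be counted somehow. The thread does not say how its collision figures were counted, so the following is only one possible sketch: it assumes the hashes sit in column B of the active sheet (a made-up layout), relies on Scripting.Dictionary (available on Windows), and counts every hash that repeats an earlier one as one collision.

```vba
' Hypothetical helper: number of cells in column B whose value
' already occurred in an earlier row of column B.
Function CountCollisions() As Long
    Dim seen As Object
    Dim lastRow As Long, r As Long
    Dim h As String

    Set seen = CreateObject("Scripting.Dictionary")
    ' last used row in column B (assumed location of the hashes)
    lastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "B").End(xlUp).Row

    For r = 1 To lastRow
        h = CStr(ActiveSheet.Cells(r, "B").Value)
        If seen.Exists(h) Then
            ' this hash was produced by an earlier row as well
            CountCollisions = CountCollisions + 1
        Else
            seen.Add h, r
        End If
    Next r
End Function
```

Dividing the result by the number of rows gives a collision rate that can be set against the percentages quoted in this thread, keeping in mind that a different counting rule will give different numbers.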