I'm using Python (2.7) and its re module for recognition of various Bitcoin data, namely:
addresses, DER sigs, OP Return hexdata, TxIDs
I've been using re.compile: for example, for a valid Tx hash (TxID):
    import re

    RE_TXHASH = re.compile('^[0-9a-fA-F]{64}$')
    if RE_TXHASH.match('f'*64):  # 'f'*63 would fail
        print 'valid tx hash!'
    else:
        raise Exception("invalid tx hash!")
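A side note on the anchors in the pattern above: in Python's `re`, `$` also matches just before a trailing newline, so 64 hex characters followed by `\n` would still pass. A minimal sketch that uses `\Z` (end of string only) instead; the names `RE_TXHASH_STRICT` and `is_txid` are made up for illustration and are not part of the original code.

```python
import re

# \Z anchors at the very end of the string; $ would also accept a trailing "\n".
RE_TXHASH_STRICT = re.compile(r'^[0-9a-fA-F]{64}\Z')

def is_txid(s):
    """Return True if s is exactly 64 hex characters (hypothetical helper)."""
    return RE_TXHASH_STRICT.match(s) is not None
```

The same `\Z` substitution should apply to the other patterns in the question if trailing newlines matter for the input source.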
I'm looking for regex patterns for the following (or input on my current best attempt):
DER signatures, general format:
    "30[sig_size]02[r_size]02[s_size][sighash]"
Bitcoin addresses:
    re.compile('^[123mn]{1}[a-km-zA-HJ-NP-Z0-9]{26,33}$')
OP Return hex strings:
    re.compile('^(6a){1}[a-fA-F0-9]{0,80}$')
TxIDs:
    re.compile('^[0-9a-fA-F]{64}$')
EDIT: to clarify, can someone help with a regex pattern for this? "30[sig_size]02[r_size]02[s_size][sighash]"
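One possible approach to the DER format, as a sketch rather than a definitive validator: a regular expression can check the fixed marker bytes (`30`, `02`) and that the string is whole hex bytes, but it cannot check that the length bytes agree with the data that follows, so that part is done in plain Python. The layout assumed here is `30 [len] 02 [r_len] [r] 02 [s_len] [s] [sighash]` with single-byte lengths; the names `RE_DER_SHAPE` and `is_der_sig` are hypothetical. It does not enforce strict DER rules such as minimal encoding of `r` and `s`.

```python
import re

# Coarse shape only: 30 <len> 02 <rlen> <rest...>, all as whole hex bytes.
RE_DER_SHAPE = re.compile(
    r'^30([0-9a-f]{2})02([0-9a-f]{2})((?:[0-9a-f]{2})+)\Z', re.I)

def is_der_sig(hexsig):
    """Loose structural check of a hex signature (hypothetical helper)."""
    m = RE_DER_SHAPE.match(hexsig)
    if m is None:
        return False
    total_len = int(m.group(1), 16)
    r_len = int(m.group(2), 16)
    rest = m.group(3)  # r, then 02 <slen> s, then the sighash byte

    # The sequence length covers everything after the first two bytes,
    # excluding the trailing one-byte sighash type.
    if total_len * 2 != len(hexsig) - 4 - 2:
        return False
    # There must be room for r plus at least the "02 <slen>" header of s.
    if r_len == 0 or len(rest) < r_len * 2 + 4:
        return False

    after_r = rest[r_len * 2:]
    if after_r[:2] != '02':
        return False
    s_len = int(after_r[2:4], 16)
    # What remains must be exactly: 02, slen, s (s_len bytes), sighash (1 byte).
    return s_len > 0 and len(after_r) == 4 + s_len * 2 + 2
```

For example, `is_der_sig('30440220' + '11'*32 + '0220' + '22'*32 + '01')` passes, while changing the `44` length byte or dropping the final sighash byte makes it fail.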
Not all signatures in the Bitcoin block chain are DER. OP_RETURN-prefixed outputs don't have to be 80 bytes. – Anonymous – 2015-11-26T10:13:31.877
"Not all signatures in the Bitcoin block chain are DER." Is there an example of a non-DER signature in the blockchain? – amaclin – 2015-11-26T13:51:26.920
@Bitcoin I'm looking to recognise just DER signatures with regex, so I'm open to any suggestions for the regex pattern. Re OP_RETURN, isn't the regex saying between 0 and 80? (Honest question) – Wizard Of Ozzie – 2015-11-26T15:13:55.027
@amaclin All of the ones with unnecessary amounts of padding on the values are BER, not DER. You must use the smallest possible encoding for DER rules. – Anonymous – 2015-11-26T15:28:19.460
Yes the regex is saying between one and eighty, but that's not a requirement of the data type. – Anonymous – 2015-11-27T01:19:19.567
WRT the OP_RETURN: You're missing the data push byte. The data doesn't start immediately after the return. Also, 80 bytes is standard, but not required by consensus. – Nick ODell – 2015-11-27T01:50:41.523
@NickODell Expecting a certain data push type is ill-advised; push types are malleable and not always used consistently. For 80 bytes you could use 0x50 for a direct push, 0x4c50 for PUSHDATA1, 0x4d0050 for PUSHDATA2, or 0x4e00000050 for PUSHDATA4. Regular expressions aren't ideal for this task, especially for BER signatures. – Anonymous – 2015-11-27T02:10:13.307
@amaclin Some altcoins also now contain the bytes to trigger the OpenSSL 32-bit consensus failure. – Anonymous – 2015-11-27T02:12:41.793
@Bitcoin How did we reach discussing altcoins, DER/BER and OP_RETURN ahead of my question on regexes? Me: "I'm looking to recognise just DER signatures with regex, so I'm open to any suggestions for the regex pattern." seems pretty clear and independent of DER/BER classification – Wizard Of Ozzie – 2015-11-27T15:15:47.037
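On the push-opcode point raised in the comments above, here is a rough sketch of handling the data push in plain Python instead of a single regex. It assumes the script is `OP_RETURN` (`6a`) followed by exactly one push and nothing else, which is only one of the shapes an output can take. As far as I know, direct push opcodes cover 1 to 75 bytes (`0x01` to `0x4b`), so an 80-byte payload needs `PUSHDATA1` or larger, and the `PUSHDATA2`/`PUSHDATA4` length fields are little-endian. The names `RE_HEX` and `op_return_payload` are hypothetical. Note also that `{0,80}` in the question's pattern counts hex characters, not bytes.

```python
import re

RE_HEX = re.compile(r'^(?:[0-9a-fA-F]{2})*\Z')  # whole bytes only

# Number of length bytes that follow each PUSHDATA opcode.
PUSHDATA_LEN_BYTES = {0x4c: 1, 0x4d: 2, 0x4e: 4}

def op_return_payload(script_hex):
    """Return the pushed data (hex) of an 'OP_RETURN <one push>' script, else None."""
    if not RE_HEX.match(script_hex) or script_hex[:2].lower() != '6a':
        return None
    body = script_hex[2:]
    if len(body) < 2:
        return None  # bare OP_RETURN, no push to inspect
    opcode = int(body[:2], 16)

    if 0x01 <= opcode <= 0x4b:
        # Direct push: the opcode itself is the byte count.
        len_bytes, size = 0, opcode
    elif opcode in PUSHDATA_LEN_BYTES:
        len_bytes = PUSHDATA_LEN_BYTES[opcode]
        raw = body[2:2 + 2 * len_bytes]
        if len(raw) != 2 * len_bytes:
            return None
        # Length field is little-endian: reverse the byte order before parsing.
        pairs = [raw[i:i + 2] for i in range(0, len(raw), 2)]
        size = int(''.join(reversed(pairs)), 16)
    else:
        return None

    data = body[2 + 2 * len_bytes:]
    return data if len(data) == 2 * size else None
```

This deliberately does not enforce the 80-byte standardness limit, since the comments point out that it is policy and not a consensus rule.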