I've asked myself similar questions, which led me to learn more about mnemonics.
Bitcoin, like many other (but not all) cryptocurrencies, uses BIP39, a specification based on a fixed wordlist of 2048 words (zero-indexed, 0 to 2047) that is available in several languages. Note that you cannot translate a mnemonic word-for-word between supported languages: a translated word does not generally sit at the same index in the other language's list, so the result would encode different data (in case that is the kind of "conversion" you had in mind).
- The purpose of BIP39 is to create a mnemonic that is human-readable
(easy to write, recite, read, etc.), in contrast to the underlying
binary data that the mnemonic represents, which is machine-readable
(zeroes and ones, which can be written in hex or another base) but
not easy to write or recite. BIP39 solves that problem; however,
understanding what is happening beneath the surface is key to dealing
with mnemonics and taking control of your crypto security.
About mnemonics: Underlying every mnemonic is a very large random number and a checksum that gets added to the end, which is why you cannot just choose your own 12 or 24 words, as a portion of the last word is computed based on this checksum (more on that below).
For example, the BIP39 wordlist has 2^11 (i.e. 2048) entries, so each word encodes 11 bits, and whether the mnemonic ends up being 12 or 24 words depends on the desired security level: 128 bits or 256 bits of entropy.
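As a rough illustration of how the word count follows from the entropy size, here is a minimal sketch. It assumes the BIP39 rule that the checksum length is the entropy length divided by 32; the function name `mnemonic_sizes` is made up for this example.

```python
def mnemonic_sizes(entropy_bits):
    """Return (checksum_bits, word_count) for a given BIP39 entropy size."""
    if entropy_bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP39 entropy must be 128 to 256 bits, in steps of 32")
    checksum_bits = entropy_bits // 32                      # 4 bits for 128, 8 bits for 256
    word_count = (entropy_bits + checksum_bits) // 11       # 11 bits per word
    return checksum_bits, word_count


for ent in (128, 256):
    cs, words = mnemonic_sizes(ent)
    print(f"{ent} bits of entropy + {cs} checksum bits -> {words} words")
```

So 128 + 4 = 132 bits gives 12 words, and 256 + 8 = 264 bits gives 24 words.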
So when you click to create a BIP39-compliant wallet, there should be a cryptographically secure pseudo-random number generator (CSPRNG) running locally, such as the World Wide Web Consortium (W3C) Web Cryptography API for web-based wallets (note: I am a contributor to that spec on GitHub). It gathers various bits of randomness from your local device in a way that would make it difficult for an attacker to replicate or predict, and those bits are then used as the initial entropy (to create the mnemonic).
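For readers who want to see what "asking a CSPRNG for entropy" looks like in practice, here is a minimal sketch. It uses Python's standard `secrets` module as a stand-in for the browser API mentioned above (a substitution on my part, not what a web wallet would call), and the function name `generate_entropy` is hypothetical.

```python
import secrets


def generate_entropy(bits=256):
    """Return `bits` of entropy drawn from the operating system's CSPRNG."""
    if bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP39 entropy must be 128 to 256 bits, in steps of 32")
    return secrets.token_bytes(bits // 8)


entropy = generate_entropy(128)
print(entropy.hex())  # different on every run; never reuse printed entropy for real funds
```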
In other words, for a 24-word mnemonic, the device would produce 256
bits of entropy, treat them as a byte array, and hash it with SHA-256.
The result is another 256-bit output, of which the leading 8 bits are
taken and appended to the end of the initial entropy, making it
264 bits long.
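The checksum step described above could be sketched as follows. This is only an illustration using the standard `hashlib` module; the helper names are made up here, and the all-zero input is a placeholder, not real entropy.

```python
import hashlib


def checksum_bits(entropy):
    """First ENT/32 bits of SHA-256(entropy), as a string of '0' and '1'."""
    cs_len = len(entropy) * 8 // 32
    digest = hashlib.sha256(entropy).digest()
    digest_bits = "".join(f"{byte:08b}" for byte in digest)
    return digest_bits[:cs_len]


def entropy_with_checksum(entropy):
    """Entropy bits followed by the checksum bits."""
    entropy_bits = "".join(f"{byte:08b}" for byte in entropy)
    return entropy_bits + checksum_bits(entropy)


demo = bytes(32)  # placeholder: 32 zero bytes, NOT random, for illustration only
bits = entropy_with_checksum(demo)
print(len(checksum_bits(demo)), len(bits))  # 8 checksum bits, 264 bits in total
```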
If you divide 264 bits by 11 bits per word, you get 24 groups of
11 bits, and each group is the index of a word on the list (where
0 = 00000000000 and 2047 = 11111111111). Each word in the resulting
mnemonic therefore corresponds to an 11-bit number. The words are
just an easy way to find those 11-bit numbers and recreate the
initial entropy + checksum from which the wallet is derived.
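Here is a small sketch of the grouping step, using the 128-bit entropy and the checksum bits from the worked example at the end of this answer. The checksum is copied from that example rather than recomputed, and `to_indices` is a name chosen for this illustration.

```python
def to_indices(bits):
    """Split a bit string into 11-bit groups and return each group as an integer."""
    if len(bits) % 11 != 0:
        raise ValueError("bit string length must be a multiple of 11")
    return [int(bits[i:i + 11], 2) for i in range(0, len(bits), 11)]


entropy_hex = "659b8a03bfbb80cdcdc3c383d4b0d505"
checksum = "1100"  # taken from the worked example, not recomputed here
bits = f"{int(entropy_hex, 16):0128b}" + checksum

print(to_indices(bits))
# [812, 1762, 1031, 1019, 1472, 823, 440, 963, 1054, 1324, 426, 92]
```

Looking up each index in the wordlist is then all that remains to get the words.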
In terms of converting:
It is not recommended to choose your own words, firstly because the selection process could have some bias that would reduce the security of the mnemonic compared to a randomly generated one, and secondly because the checksum must be computed deterministically. So if you cut your 24 words in half and used the first 12 in another wallet, there is only a low chance they would be checksum-compliant (1/16), and choosing your own 24 words would be even less likely to be checksum-compliant (1/256).
- Therefore, it is best not to choose your own words or slice your
mnemonic, but rather use trusted software in a cold-storage
environment (offline) for maximum security.
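To make the 1/16 figure concrete, here is a sketch that checks word indices against the checksum rule and counts how many of the 2048 possible last words are valid for a fixed set of first 11 words. The indices are the first eleven from the worked example at the end of this answer; the function name is hypothetical.

```python
import hashlib


def is_checksum_valid(indices):
    """Check whether a list of 11-bit word indices satisfies the BIP39 checksum."""
    total_bits = len(indices) * 11
    if total_bits % 33 != 0:
        raise ValueError("word count must be a multiple of 3")
    cs_len = total_bits // 33                      # same as entropy bits / 32
    value = 0
    for index in indices:
        value = (value << 11) | index              # concatenate the 11-bit groups
    entropy = (value >> cs_len).to_bytes((total_bits - cs_len) // 8, "big")
    checksum = value & ((1 << cs_len) - 1)
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_len)
    return checksum == expected


first_eleven = [812, 1762, 1031, 1019, 1472, 823, 440, 963, 1054, 1324, 426]
valid_last = [w for w in range(2048) if is_checksum_valid(first_eleven + [w])]
print(len(valid_last), "of 2048 candidate last words are valid")  # 128 of 2048, i.e. 1/16
```

The count is always 128 for 12 words: the last word carries 7 entropy bits, and each of those 128 choices has exactly one matching 4-bit checksum.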
Entropy: The same applies to choosing your own entropy: it is best left to trusted software. That said, others in the industry have given examples such as flipping a coin 128 times and writing a 1 for every heads and a 0 for every tails, which is a way to manually create 128 bits of entropy for a wallet (the wallet would then compute the 4-bit checksum), provided the coin-flip process is random (i.e. the coin is not weighted toward one side).
- However, many wallets don't allow you to paste entropy, and only
allow a mnemonic to be imported or to be generated on the fly.
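For the coin-flip approach, turning the recorded flips into entropy bytes could look like the sketch below. The `H`/`T` input format and the function name are assumptions made for this example, and the alternating sequence is obviously not random.

```python
def flips_to_entropy(flips):
    """Convert 128 recorded coin flips ('H' or 'T') to 16 bytes: heads = 1, tails = 0."""
    flips = flips.upper()
    if len(flips) != 128 or set(flips) - {"H", "T"}:
        raise ValueError("expected exactly 128 flips, each H or T")
    bits = flips.replace("H", "1").replace("T", "0")
    return int(bits, 2).to_bytes(16, "big")


# Placeholder input for illustration only: a real sequence must come from real flips.
print(flips_to_entropy("HT" * 64).hex())  # aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
```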
There is a mnemonic converter tool (note: I am a contributor to it on GitHub) with advanced features that can accept entropy and create mnemonics of various lengths. I wouldn't suggest it for someone who has just started, or even for those with years of experience, unless you are fluent in the BIP39 specification as well as BIP44 and BIP32, and aware of the countless ways that one tiny mistake can lead to an irreversible loss of funds.
Regarding the question of converting: I am not sure exactly what you meant by "converting", but assuming you meant slicing a 24-word mnemonic in half to use just the first 12 words, I have explained above why that isn't a viable option unless the 12 words happen to be checksum-compliant. (Some software can scan through valid 24-word mnemonics to find one that is also checksum-compliant when using only the first 12 words, but the last 12 words would likely not be checksum-compliant, as there is only a 1/256 chance for both the first 12 AND the last 12 to be.)
- Such mnemonics exist but are hard to find/create while maintaining
the desired level of security in terms of bits. Every bit of entropy
removed halves the security, so a 12-word mnemonic doesn't have half
the security of a 24-word mnemonic but rather the square root of it,
since 2^128 * 2^128 == 2^256. Note: I have written a similar program
that finds valid reversible palindromic mnemonics that are
checksum-compliant in terms of BIP39, but only for educational
purposes.
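The arithmetic behind the square-root comparison can be checked directly with Python's arbitrary-precision integers. The variable names are just for this illustration.

```python
import math

short_space = 2 ** 128   # possible entropy values behind a 12-word mnemonic
long_space = 2 ** 256    # possible entropy values behind a 24-word mnemonic

# The 24-word space is the 12-word space squared, not merely doubled.
print(short_space * short_space == long_space)   # True
print(math.isqrt(long_space) == short_space)     # True
print(long_space // short_space == 2)            # False
```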
Regarding using the same mnemonic in all wallets: If a wallet is compatible with BIP39, there is a good chance it is also compatible with BIP44, which supports multiple accounts and coins. That means that as long as the wallet's developers have added support for the correct derivation path, that cryptocurrency can be used with your existing mnemonic (but not if it uses a different wordlist, like Monero, for example, which is not BIP39-compatible). This is why I like to think of mnemonics as crypto vaults, not wallets: one mnemonic can hold accounts for multiple cryptocurrencies, each of which contains multiple wallets (see HD wallets, based on BIP32, to learn more).
- While having all your crypto assets consolidated on one mnemonic
could be convenient, it also concentrates the risk in the custody of
that single mnemonic. Having multiple mnemonics, each with multiple
cryptocurrencies, adds the burden of more crypto vaults to manage but
reduces the damage if one is compromised. (I would argue most savvy
crypto investors have several: one or more for hot wallets on
internet-connected devices, and others for cold storage on hardware
wallets or devices that never go online.)
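To show what a derivation path looks like, here is a sketch that builds BIP44 path strings of the form `m/44'/coin_type'/account'/change/address_index`. The coin type numbers come from the SLIP-44 registry, which is not mentioned above, and the helper name and dictionary are made up for this example; a real wallet may use other paths.

```python
# Coin types as registered in SLIP-44 (only a few, for illustration).
COIN_TYPES = {"bitcoin": 0, "litecoin": 2, "ethereum": 60}


def bip44_path(coin, account=0, change=0, address_index=0):
    """Build a BIP44 path; the apostrophe marks a hardened derivation level."""
    coin_type = COIN_TYPES[coin]
    return f"m/44'/{coin_type}'/{account}'/{change}/{address_index}"


print(bip44_path("bitcoin"))                               # m/44'/0'/0'/0/0
print(bip44_path("ethereum", account=1, address_index=5))  # m/44'/60'/1'/0/5
```

One mnemonic, many paths: each coin and account lives under its own branch of the same tree.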
Takeaway: it is best to use the software provided by the underlying token/cryptocurrency project, as some projects use a different wordlist or differ in other ways that could render the mnemonic incompatible and lead to a permanent loss of funds. There can also be lesser inconveniences: for example, the same mnemonic could derive addresses that don't match across software, even with the same derivation path chosen, if the root seed or another step diverges for unknown reasons (and troubleshooting to find such a bug is very technical).
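For context on the "root seed" step: BIP39 specifies it as PBKDF2 with HMAC-SHA512, 2048 iterations, and the salt "mnemonic" followed by an optional passphrase. The sketch below follows that rule using only the standard library; the function name is hypothetical, and the mnemonic is the example one from this answer (don't use it with real funds).

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic, passphrase=""):
    """Derive the 64-byte BIP39 seed from a mnemonic sentence and optional passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048)


words = "grain sword liberty legal retreat group damage journey long pitch crystal argue"
seed = mnemonic_to_seed(words)
print(len(seed))  # 64 bytes, which then feed BIP32 key derivation
```

Note that this step never consults the wordlist, so two programs that disagree on a single character of the sentence or the passphrase will end up with entirely different seeds.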
Example technical data for a randomly-generated mnemonic (don't use this with real funds):
Initial entropy of 128 bits in hex format: 659b8a03bfbb80cdcdc3c383d4b0d505
bytearray(b'e\x9b\x8a\x03\xbf\xbb\x80\xcd\xcd\xc3\xc3\x83\xd4\xb0\xd5\x05') <--- Entropy as bytes
c993b627272ef0cbc683ce275cf47ff82f73403ece8155bdd92c2dca2d86e3b1 <--- SHA-256 hash digest of entropy bytes, in hex format
c <--- Partial fragment of initial "byte" of hash representing first 4 bits.
c <--- First n bits of hash to convert to hex
1100 <--- Checksum (hex to bits)
Initial entropy + checksum = total bits: 011001011001101110001010000000111011111110111011100000001100110111001101110000111100001110000011110101001011000011010101000001011100
Length of total bits: 132 divided into 12 groups of 11 bits
['01100101100', '11011100010', '10000000111', '01111111011', '10111000000', '01100110111', '00110111000', '01111000011', '10000011110', '10100101100', '00110101010', '00001011100']
Corresponding index values for each group (in base 10):
[812, 1762, 1031, 1019, 1472, 823, 440, 963, 1054, 1324, 426, 92]
Corresponding mnemonic:
grain sword liberty legal retreat group damage journey long pitch crystal argue
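Going the other way, the index values listed above can be turned back into the entropy and checksum. This is a minimal sketch with a made-up function name; the word-to-index lookup is skipped because it needs the wordlist file.

```python
def split_indices(indices):
    """Recover (entropy_bytes, checksum_bit_string) from 11-bit word indices."""
    total_bits = len(indices) * 11
    cs_len = total_bits // 33                     # checksum bits: 4 for 12 words
    value = 0
    for index in indices:
        value = (value << 11) | index             # concatenate the 11-bit groups
    entropy = (value >> cs_len).to_bytes((total_bits - cs_len) // 8, "big")
    checksum = format(value & ((1 << cs_len) - 1), f"0{cs_len}b")
    return entropy, checksum


indices = [812, 1762, 1031, 1019, 1472, 823, 440, 963, 1054, 1324, 426, 92]
entropy, checksum = split_indices(indices)
print(entropy.hex(), checksum)  # 659b8a03bfbb80cdcdc3c383d4b0d505 1100
```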
You'll find the technical details of how those words map to a key in the BIP39 spec. It is theoretically possible for two sentences to generate the same key, but being able to find such an (HMAC-SHA512) collision would be a huge issue for cryptography in general (https://crypto.stackexchange.com/questions/3049/are-there-any-known-collisions-for-the-sha-1-2-family-of-hash-functions). Though you might earn something similar to a Nobel Prize if you find a way :)
– zapl – 2019-07-31T10:42:00.143