2 part missive. part 1: coinbase hacks. part 2: tiny llm speed softtuning draft

Coinbase Hacks

It turns out that if you have a hold on funds in your Coinbase account, and you raise your USD balance to match the hold (e.g. by closing orders in USDC), then you can transfer amounts higher than that threshold. After the transfer you can then reopen the orders at the same levels.

---

Tiny LLM Speed Softtuning Draft

I tried to draft a start around adapting existing LLMs to produce output tokens in parallel rather than serially, kind of like daydreaming (when we daydream we reconsider all parts as they seem interesting).

This computer is dying and not charging, likely due to a frayed charging cable. The current draft is accessible via the Arweave id in the JSON below. I'm sending this now rather than a more put-together post for that reason.

The purpose of the work is to make embedded decentralized use workable. For example, if completed, this would let the Petals network generate an entire paragraph in under a second; right now it takes about that long to generate a single token. It was more relaxing than expected to work on this.

Note: The goal of adapting language models to produce their entire context in a single pass is definitely doable, but the approach below of a single-token softprompt is very simplified and unlikely to fully converge. The reason this approach is more doable for me is that it keeps things very similar to existing work.

The context is extended with dummy embeddings for generating the extra tokens. Language models already produce dense parallel output; it is simply conventional to use only the last token. A rough sketch of this setup appears after the list below.

Ways to make it more powerful:

- Insert more sequence embeddings before the output is collected. This lets it perform more computation.
- Remove the causal mask. This lets information flow in both directions for twice as much computation, but involves modifying the inference architecture or library code a tiny bit.
- Rather than training a single embedding, train one for each position. You can even train one for each underlying absolute position embedding, as well as one for each relative position, and sum them.
- Alternatively, use e.g. LoRA or finetuning instead of soft prompts.
- Train the softprompt, adapter, or finetuning for use in a recurrent manner and let it spend a few passes adjusting the output. (You can also add another head to train e.g. a confidence % to decide whether it is done recurring.) A rough inference-time sketch of this recurrent idea follows the first sketch below.
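
Here is a minimal sketch of the basic single-token softprompt setup described above, assuming a HuggingFace causal LM. The model name "gpt2", the helper names (forward_parallel, train_step), and N_PARALLEL are my illustrative choices, not anything fixed in the draft; treat this as a rough starting point rather than the actual implementation.

    # Sketch: append N copies of one trained soft embedding after the prompt,
    # run a single forward pass, and read logits at those dummy positions to
    # predict N future tokens in parallel. Only the soft embedding is trained.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "gpt2"   # assumption: any HF causal LM that accepts inputs_embeds
    N_PARALLEL = 16       # how many future tokens to predict in one pass

    tok = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    model.requires_grad_(False)  # freeze the base model

    embed = model.get_input_embeddings()
    # one trainable "dummy" embedding, repeated for every extra position
    soft = torch.nn.Parameter(torch.randn(embed.embedding_dim) * 0.02)

    def forward_parallel(prompt_ids):
        # prompt_ids: [batch, seq]; returns logits for N_PARALLEL future tokens
        prompt_emb = embed(prompt_ids)                               # [b, s, d]
        dummies = soft.expand(prompt_ids.size(0), N_PARALLEL, -1)    # [b, N, d]
        inputs = torch.cat([prompt_emb, dummies], dim=1)
        logits = model(inputs_embeds=inputs).logits
        # the logit at position i predicts token i+1, so positions
        # s-1 .. s+N-2 predict the N tokens after the prompt
        return logits[:, prompt_ids.size(1) - 1 : -1, :]

    # training: teacher-force against the next N_PARALLEL real tokens
    opt = torch.optim.Adam([soft], lr=1e-3)

    def train_step(prompt_ids, target_ids):   # target_ids: [b, N_PARALLEL]
        logits = forward_parallel(prompt_ids)
        loss = torch.nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), target_ids.reshape(-1))
        loss.backward(); opt.step(); opt.zero_grad()
        return loss.item()

    # usage: one pass gives greedy guesses for a whole span at once
    prompt = tok("The quick brown fox", return_tensors="pt").input_ids
    guess = tok.decode(forward_parallel(prompt).argmax(-1)[0])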
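And a rough inference-time sketch of the recurrent idea from the last bullet, reusing model, embed, soft, and N_PARALLEL from the sketch above. The function name and pass count are made up for illustration; the draft's actual recurrent training (and the optional confidence head) isn't shown here.

    # Sketch: after the first parallel pass, feed the predicted tokens back in
    # place of the dummy embeddings and let the model revise them for a few
    # extra passes, so each guess gets conditioned on the previous guesses.
    @torch.no_grad()
    def generate_recurrent(prompt_ids, passes=3):
        prompt_emb = embed(prompt_ids)
        tail = soft.expand(prompt_ids.size(0), N_PARALLEL, -1)  # start from dummies
        for _ in range(passes):
            inputs = torch.cat([prompt_emb, tail], dim=1)
            logits = model(inputs_embeds=inputs).logits
            preds = logits[:, prompt_ids.size(1) - 1 : -1, :].argmax(-1)
            tail = embed(preds)   # next pass sees its own previous guesses
        return preds
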
{"id": "pZqfAnMEZxmIeGNFRwvmVBm216uoDBa6S8ALscl2NkE", "timestamp": 1708380349670, "version": "1.0.0", "public": "pGFsvdSB9sxbwU5L4HD2v12DK40kzZ5N69s6WlI3Uw9pFdHMHei3n1Tv4jvZqU9yeIMGsS60MQRvfJK1AEoNYsQqk4Rciajw0_IemZdwlt4u4voDALRalrQ3NV4knOlHRY11anqV0fNhikWCsiRPukIRZrdcFfqzFr0boH8bou7DgESNvWxROOxSC149oKxJ06FQsBDaIeElBsR8qTddybvXqMagXCM9y_HNrtAoz_8LgPjQtK5LFEbXhh9PyI_GOuoHyzJUc9Sm-V9kCB4kTm-SHrPbETQnvejZBcqEHxNcDNWBv6CWjj3-0V3dFMhjM1cy14d0Lm4j0IyRLm9bHM3s0ssVDd20gjWyar-D0o6guJIrteEC7UGR-w1yvXoGuIwdfZeoSAZ_CU9FrOJfQCTDs2aLgdCNeYKXg0Rt8YZL_elZnG7utCkO78TwxbGqear_I-1dlO39CUlo13YSS6pPonioWqkzXcXh93G7BYjgUxcPJ31kLyr2wBRA4OObAYRvh-5V3TkULlmwR4Q0pV3cUeOLI94b4WhaDZDI_RIJiCXQvtGy190NqTBeVogPrrAXLFkK0E013GByHrmzZoELfSUorjK-bDk4wXxdbVqzY7KXP-NEt3Bu-woinbUf56i3DXLrYlwINYK39VUydGpcQLZ5EDCL4u_IL_iFPt0", "signature": "aNI35ye1nZdMd33s5InHfTnqTI-Vv34fH6OPS8zpsTyvXZWorigoWQCDUzpfD-1GY_ul2L9zlu3_ZpejxUTYX4YGSimQtQ1kDA1lHvClZa-5yP74WLGTygL6ouEzbEZszi6I7bs-m15BKTpFuSOPj3go16n1uQAlV1QfKlzbDfxHZ3U5GesOZgoaXVjELfwA7YK-a6y8ojcXRkIASGaBF22BNztwKAZAZr4WHfAUIUj98l8I6m3riCXTWN6eO3LOHshVF3JF1C1CI8B3jgZYBI0CEXjewDGeHAo60vCPM1qPxE1XWT3rKZmFK0Sz6K5enQtuBZ1AiOFxMPu7P8HtVa5ITYEGyQf1lTiSmTCIvn10KtxXI-rpqCvtF-12BvzbVaPMFBFh3Lzr-h9KqMZXvJYloVBVfDwJEd5f_HBdwqytlyB8KDIJDB9TphzUlfN8XUzDeMVpgaSv1-OfjyXEVj0L6LcPaKR1S6fu7q1Q6vArhNGMrmC81tohN-BtbIqeX3BfRWjp8ZZhAuJHQtm8KwQOemiVEJmFSSwfoChsKLwLgnC0uU_nzPIewXKfDKvCmWy-IeixS4erNZM1cKdCgMAHW0bE7ucNuA1KlS3EZbTpeUl6yB_YmzZeOfyhLmL67vedoZyDEk_6_DF6Y6g_QV21E0NuBmW0qTszE0o8jew", "deadlineHeight": 1371451, "block": 1371451, "validatorSignatures": [], "name": "2024-02-17.tar", "size": 188262400, "start_height": 1367451, "start_block": "Pv5rjj2iLAhtdiCnBx3GyiJzZxob6hL6M2Kxke_LS5JBAbHtbts-iJ6BzgjhBtx0", "depth": 3}