[ot][spam][random][crazy][random][crazy]
do you want to know the answer to every question? I HAVE IT. It is a process that produces answers for questions. There are many names for this answer. Some call it a human brain, others call it google, others call it a smart friend!
How could we write down this answer to all questions? Uhhh pseudocode? how would one represent a question? A question is like a semilogical statement that is missing a component, with direction to fill in the component. okay maybe i can draft the components for building the answer to all questions 1. representation of a question: [semilogical statement missing a component, with direction to fill the component in?] 2. ...
components for building answer to all questions
1. representation of a question: [semilogical statement missing a component, with direction to fill the component in?]
2. ...
2 ... uhh what is needed to find an answer? maybe a knowledge store, and an environment? i'm thinking of maybe two parts: retrieving the answer from the knowledge store, and expanding the knowledge store using the environment.
- representation of questions
- storage of knowledge
- querying the knowledge for answers to the question
- interaction with environment
- expanding the knowledge store via exploration of the environment
- doing so in a relevant, productive way

that last bit, engaging the environment _effectively_ to expand the knowledge store, seems of interest. how to engage it effectively?
--- dialog
A: "Do you want to be a human being?"
B: "A human being?"
A: "Some of us are trying to be a human being. Just for fun!"
B: "Is it what to do?"
A: "Do you want to do it?"
B: "If it's the right thing to do, I probably want to do it!"
------------------- who is the right person to vote for? answering this question means learning about the actual behaviors of all the people. or, more accurately, the likelihood distributions of their behaviors when in office.
unless your votes are rigged! and who you vote for is more about what category of oppressive control you should be put in after your votes are fed into a computer and then thrown out.
-------- fun factoid! i believe the first major public appearance of useful tempest security was when van eck phreaking was demonstrated to convince a government not to use digital voting machines. van eck phreaking is the ability to make a copy image of somebody's display from a distance, via a radio that reads its emissions. this works for most displays. maybe i can work on it a little! i don't have any sdrs around atm. maybe it would be helpful to store some test data. when you do van eck phreaking, basically you're interested in signals that are at the pixel clock of your display, and multiples of that number at the higher harmonics.
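for a sense of scale, here's a tiny sketch of where those frequencies land. the timing numbers are assumptions for a 1920x1080 @ 60 Hz panel (totals include blanking), not measurements of any real display:

h_total = 2200       # assumed pixels per line, including horizontal blanking
v_total = 1125       # assumed lines per frame, including vertical blanking
refresh_hz = 60
pixel_clock_hz = h_total * v_total * refresh_hz           # 148,500,000 Hz for these made-up timings
harmonics = [pixel_clock_hz * m for m in range(1, 6)]     # candidate spots to tune an sdr to
print(pixel_clock_hz, harmonics)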
well this is new on the wikipedia page on van eck phreaking. i know there are a lot of fancy spy devices like this, but batteries can get one's nose itching so much ===Tailored access batteries=== A tailored access battery is a special laptop battery with Van Eck Phreaking electronics and power-side band encryption cracking electronics built-into its casing, in combination with a remote transmitter/receiver. This allows for quick installation and removal of a spying device by simply switching the battery.<ref>White paper, FDES institute, 1996, page 12.</ref>
----- anyway let's find a puzzle similar to the van eck phreaking puzzle! what's some simple puzzle like that? i want to extract a signal from noise.
hummmmmm how about audio! i'm on a laptop atm. its fan is running. maybe i can find data on the fan from the microphone.
HOBBY PENTESTING! Welcome, this is a hobby pentesting and defense list, focused on cryptography.
So, those of us who can still gurgle through the maimings, are quite curious what the problem is with HOBBY PENTESTING and why it gets you MURDERED.
We are kind of wondering if our EXPERIENCE WITH HOBBY PENTESTING could be useful in PREVENTING THE MURDER OF HOBBY PENTESTERS.
yup yup meanwhile i thought it would be fun to use techniques from last millennium to extract signals i already own from microphone recordings of my computer fan.
when i turn the volume up in gnome i hear a blip from speakers though! i fixed this computer by hand last month. it had been so broken for so long. big accomplishment. i swapped in three different mainboards and it now has a nordic keyboard which is mapped to my own country's locale. turns out my memory is stronger than my eyes re typing.
[the speakers didn't function before] although strangely, the original mainboard ended up working, i'm still using the same one and returned the replacements used for troubleshooting. either it just needed cables reseated (some likelihood here) or it was just disabled somehow and worked fine
i'll look at running processes to see if any have to do with audio

$ ps -Af f | less
/audio
user 4697 1 0 Nov06 ? S<l 0:07 /usr/bin/pulseaudio --start --log-target=syslog

i've got pulseaudio running!
i checked out 4697's open filehandles and i think my eyes just missed the /dev/snd folder somehow when i was looking. i miss things a lot when looking for them.
ooooh i remember i have experience with a general recording library! i made an audiologger with it! [to write to blockchains!]
in the outer context, i was working on stabilising an android phone i'm jailbreaking, so as to restore from past images more easily. i also wanted to clean up from this desk, which is not mine
miniaudio has an example for _playing_ rather than recording ... ohh the documentation is in the header file itself :D
i want to do either a fourier transform or a self convolution of audio data! maybe self convolution, not sure
i like c++ but i don't know how to do accelerated math in it :S maybe i'll use python and torch
i want to imagine summing a signal from each rotation of the fan, to magnify its details and average out its noise! i've been thinking of this for about a decade. uhhhhhh i guess that would be BASIC MATH?
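a minimal numpy sketch of that daydream, assuming the rotation period in samples is already known (the number here is made up; a fourier peak could supply the real one):

import numpy as np

samples_per_rotation = 480                     # assumed: 48000 Hz mic / 100 Hz fan rotation
recording = np.random.random(48000)            # stand-in for a second of microphone samples
n_rotations = len(recording) // samples_per_rotation
chunks = recording[:n_rotations * samples_per_rotation].reshape(n_rotations, samples_per_rotation)
rotation_average = chunks.mean(axis=0)         # rotation-locked detail stays; noise shrinks roughly as 1/sqrt(n_rotations)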
but maybe my issues think of it as more FORENSIC MATH. did you know digital forensics used to be a major topic of this list?
HOBBY DIGITAL FORENSICS HOBBY PENTESTING probably professional pentesting and forensics too; everything was hobby in the 90s.
so what you might call HOBBY FORENSIC MATH, I think of as kind of HIGH SCHOOL ALGEBRA? I don't have math training, myself, much.
REVEALING ADVANCED FORENSIC TECHNOLOGY :D via horrific slavery of poorly-labeled civilians :S
i'm looking forward to exercising my RIGHT TO IMPLEMENT ADVANCED RESEARCH PAPERS again.
this list, and others, used to be the cutting edge. no, it wasn't researchers or professionals, i think it was mostly hobbyists.
some things still are the cutting edge. many of those hobbyists became researchers and professionals.
miniaudio is outputting all zeros for my microphone i've only tried really making it work before on android phones.

$ ./capture | hexdump -C
00000000  52 49 46 46 1c 00 00 00  57 41 56 45 66 6d 74 20  |RIFF....WAVEfmt |
00000010  10 00 00 00 01 00 01 00  80 bb 00 00 00 77 01 00  |.............w..|
00000020  02 00 10 00 64 61 74 61  00 00 00 00 00 00 00 00  |....data........|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
$ pip3 install pyaudio

import pyaudio
audio = pyaudio.PyAudio()

# ! the link on reddit referenced a text2speech project from 6 years ago, to show how it works (jasper)
import torch
import pyaudio

class AudioInput:
    def __init__(self):
        self.audio = pyaudio.PyAudio()
    def stream(self, frames_at_once):
        stream = self.audio.open(
            format=pyaudio.paFloat32,
            channels=1,
            rate=48000,
            input=True,
            frames_per_buffer=frames_at_once
        )
        try:
            while True:
                data = stream.read(frames_at_once)
                tensor = torch.frombuffer(data, dtype=torch.float)
                yield tensor
        finally:
            stream.stop_stream()
            stream.close()
    def __del__(self):
        self.audio.terminate()

if __name__ == '__main__':
    for chunk in AudioInput().stream(1024):
        print(chunk)
ok i don't remember what convolution is. uhhh i want to find stuff that's more ... maybe adding would be nice? some function that torch already has would seem nice, dunno how confusing!
https://en.wikipedia.org/wiki/Convolution_theorem
convolution in one domain (e.g., time domain) equals point-wise multiplication in the other domain (e.g., frequency domain)
so the self convolution might be findable via the square of the fourier transform, dunno !
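a quick numpy check of that guess, with nothing fan-specific in it: squaring a zero-padded fft and inverting matches a direct self-convolution.

import numpy as np

x = np.random.random(16)
direct = np.convolve(x, x)                                       # linear self-convolution, length 2N-1
via_fft = np.fft.ifft(np.fft.fft(x, 2 * len(x) - 1) ** 2).real   # pad so circular wraparound doesn't alias
print(np.allclose(direct, via_fft))                              # True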
one idea is, what kind of information am i looking for here? what is in a fan? i have no idea. maybe i should just take the fourier transform and look for a peak. then i can skip to the greater challenge of visualization!
what would i like it to look like? maybe a long-running single fourier plot (a line graph) that updates as it accumulates more data, becoming smoother?
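the data side of that plot could be a running mean of chunk spectra; here's a rough sketch with made-up chunk sizes (a generator of real mic chunks, like the pyaudio stream above, could replace the stand-in):

import numpy as np

rate = 48000
chunk_len = 4096
chunks = (np.random.random(chunk_len) for _ in range(100))    # stand-in for streamed mic chunks

mean_spectrum = np.zeros(chunk_len // 2 + 1)
for count, chunk in enumerate(chunks, start=1):
    spectrum = np.abs(np.fft.rfft(chunk))
    mean_spectrum += (spectrum - mean_spectrum) / count       # running mean; smooths as data accumulates
freqs_hz = np.fft.rfftfreq(chunk_len, d=1 / rate)             # x axis for the line graph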
found https://vispy.org/
VisPy is a high-performance interactive 2D/3D data visualization library leveraging the computational power of modern Graphics Processing Units (GPUs) through the OpenGL library to display very large datasets.
dunno. at least somebody made a library.
ideally a gpu-accelerated plotting lib would let you pass gpu buffers as data (which i actually have here) but that is likely not the situation here.
dlpack is the structure used for cross-hardware vectors: https://dmlc.github.io/dlpack/latest/ i wonder if vispy can load it yet. if they haven't heard of it, i could open an issue
i opened an issue but it looks like it actually supports gpu buffers already via numpy, which itself already supports dlpack there's no need for them to mention it ;p i am again silly in public
so far it does actually look like it only does cpu buffers. hum. i made a rotating square following a tutorial! been years since i did that
i did about half the work needed to implement direct gpu data access in vispy. unfortunately i did not document my work well. it took longer than i would have liked. takes a long time to sort through and fix my dense cognitive mistakes. https://github.com/vispy/vispy/pull/2429
-------- back to fun signals and high school algebra

during my psychosis i.e. one big long extended flashback, i often daydream of how you might upsample a signal from an extended recording, because i am trying to try it. i poked at a fourier transform by hand, without using visuals, again, and i am reminded that the indices in the fourier transform are the frequency, not the period, which means you get more fine detail at high frequencies when you take the fourier of lots of data.

anyway, so, i guess this is obvious already, but i wonder if that means you could take a long recording of a repeating signal, and then take only the higher region of the fourier transform output, and then apply the ifft to that, to get an upsampled version. i'm guessing that this would roughly work, and the only reason i haven't done it in the past is because i didn't have enough computer ram to do a fourier that large, and wanted to understand it to take it apart manually.

i'd like to test it! i'm thinking of generating some kind of high quality signal, and then repeating it, and sampling it at a lower and non-integral-multiple rate of the generated data, taking the fourier transform, and then reconstructing it. this may engage more triggers than i expect, but it sounds fun!
i'm thinking it could be nice to construct the data without repeating it. this basically means indexing a single long generated example using modular arithmetic
original_signal = np.random.random(1024*1024)
trigger #2: due to the number of things i am doing here, the output is going to be imperfect and fail the first few tests i run. this is simply because this is a lot of things for me to hold and plan at once.
period_length = len(original_signal) / 1024 * np.random.random()
i'd like to sample original_signal such that it is period_length long, in one big long stretch that ends up being the same original length ...
ok so i can imagine that a time clock is ticking for every sample, by 1. idx = time. uhhhh so in the first period, the sample would be original_signal[int(idx * len(original_signal) / period_length)] ... i think?
int(0 * len(original_signal) / period_length)
0
int(1 * len(original_signal) / period_length)
6195
int(2 * len(original_signal) / period_length)
12391
yeah, that looks right so, um ... i would just take the modulo of that number with the length. that could happen anywhere in the index formula. maybe i'll make a .py file!
trigger #3 (#4?): might relate to switching contexts while considering parts of fft apis and fourier data, don't quite remember. might relate to amount of time spent coding something while holding a goal, not sure.

import numpy as np

signal_bufsize = 1024 * 1024
recording_bufsize = signal_bufsize

original_signal = np.random.random(signal_bufsize)
recording = np.empty(recording_bufsize)
period_length = len(original_signal) // recording_bufsize ** 0.5 * np.random.random()

for out_idx in range(recording_bufsize):  # using a loop because i am confused
    in_idx = int(out_idx * signal_bufsize / period_length) % signal_bufsize
    recording[out_idx] = original_signal[in_idx]

# now take fft with np.fft.rfft
# then select only upper portion of 1 + bandwidth equal to period_length
# then reconstruct using irfft [curious: is this complex output? what do the angles look like?]
# then downsample original_signal to the bandwidth using averaging
# then compare real values
# NOTE: failure is expected AT FIRST. then, identify the mistake, and improve it.
this must have been the issue:
# then select only upper portion of 1 + bandwidth equal to period_length

ok, uhhhhhhhhhhhh so what is "bandwidth equal to period_length"? well, period_length says the period in recording samples that the wave is. meanwhile, fft indices are frequencies in units of 1/bufsize, roughly. so let's think about the frequencies, roughly.
frequency 0 is DC offset. that means its period is infinite. (bufsize / 0)
frequency 1 is the first half vs the last half, its period is the entire buffer. (bufsize / 1)
the last frequency, bufsize/2, is the nyquist frequency. its period is 2.
now what is bufsize? the parameter of interest is the recording buffer size [continue]
[..continue] so if recording_bufsize is the length of the data passed in, then recording_bufsize // 2 + 1 is the output of np.fft.rfft . it removes one side of the data. then we have rfft_bufsize = recording_bufsize // 2
so at rfft index 0, this represents a period of infinity
at rfft index 1, this represents a period of recording_bufsize
at rfft index 2, i suspect this would be a period of recording_bufsize / 2
and at the last rfft index, i.e. recording_bufsize // 2 + 1 - 1, this would be a period of 2
so there's a quotient relationship between recording_bufsize, and the expression recording_bufsize // 2, that yields 2 . one is the numerator, and one is the denominator
recording_bufsize / (recording_bufsize / 2) = 2
that works for the others too
recording_bufsize / 0 = infinity
recording_bufsize / 1 = recording_bufsize
recording_bufsize / 2 = recording_bufsize / 2
so I think you divide the original buffer size, by the rfft index, and this simple relation gets the exact period, still not sure. and the reason half is missing in that calculation ... is not stable in me but not super important here ... i think it's because the highest frequency is the nyquist, whose period is 2. so, i guess i'll check:
recording_bufsize / rfft_index = equivalent period in indices
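a quick standalone check of that relation, with a made-up wave whose period is known in advance:

import numpy as np

bufsize = 4096
period = 64                                                # chosen so it divides bufsize evenly
wave = np.sin(2 * np.pi * np.arange(bufsize) / period)
peak_idx = np.argmax(np.abs(np.fft.rfft(wave)[1:])) + 1    # skip the DC bin
print(peak_idx, bufsize / peak_idx)                        # 64 64.0 -> bufsize / rfft_index recovers the period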
i figure i'll put it in pdb and poke at it to verify it

recording_fft = np.fft.rfft(recording)
# now take fft with np.fft.rfft
# then select only upper portion of 1 + bandwidth equal to period_length
# note: am guessing that period_in_recording = recording_bufsize / rfft_idx
# so, solving for rfft_idx
#   rfft_idx = recording_bufsize / period_in_recording
# and, if right, that would be the index in the rfft output of the period's peak
# so further data on it would be to the right, at higher frequencies
import pdb; pdb.set_trace()
fft_subregion = recording_fft[int(recording_bufsize / period_length)]
so i ran it in pdb. i had to fix a couple typo-y things. the output of the sequence, when passed back into irfft, is almost as long as the original signal, but the data is quite different. that's the test failure that is expected. so, it was an accomplishment to get this far, to get all those parts in, and now we get to make a test pass, showing it works.

the failure of the output data matching the input likely relates to both phase offset in the output and the density of the signal content at higher frequencies than are measured. we can reduce the high frequency content by changing how the example data is generated, or we can sample it better by greatly increasing the recording buffer size and reducing the original signal buffer size.

a bigger challenge for me is how to compare two signals that have an unknown phase offset. an alternative would be to calculate what the phase offset is; it may be the average of all the phase offsets of all the placed test waves. this has more steps for me to make a mistake in. normally, without a gfx-visualisation inhibition, i would just plot it. the human brain is really good at comparing phase offsets.
- i'm thinking it would make sense to try taking the fft of both waves, and compare their lower frequencies. this is simple enough, and likely enough to work, to be worth trying.
maybe what i'll do is take the fft of both waves, take the absolute value of both ffts to remove phase information, then regenerate them at the same buffer size, and compare the values of the data. that is expected to have a 40% chance of working, with most of the 60% error space occupied by small logical errors, not a process error. it is of course still possible that the process is faulty. [and i am quite delusional! so!] if that works, which it should be possible to make happen, then i'd like to drop the original signal sample size sufficiently so as to regenerate it exactly. then it would be fun to plot it! maybe i could imagine i'm reconstructing my fan data or something.
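a tiny sanity check of the abs-of-fft idea in isolation, using a made-up sine rather than the recording: two copies of the same wave at different phase offsets have different samples, but the same |rfft|, so regenerating from the magnitudes lines them up for comparison.

import numpy as np

t = np.arange(1024)
a = np.sin(2 * np.pi * 5 * t / 1024)
b = np.sin(2 * np.pi * 5 * t / 1024 + 1.3)                     # same wave, shifted phase
print(np.allclose(a, b))                                       # False: the raw samples differ
print(np.allclose(abs(np.fft.rfft(a)), abs(np.fft.rfft(b))))   # True: the magnitudes match
print(np.allclose(np.fft.irfft(abs(np.fft.rfft(a))),
                  np.fft.irfft(abs(np.fft.rfft(b)))))          # True: the regenerated waves match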
period_fft_idx = recording_bufsize / period_length
fft_subregion = recording_fft[int(recording_bufsize / period_length):]

(Pdb) p np.fft.irfft(fft_subregion).shape
(1038584,)
(Pdb) p original_signal.shape
(1048576,)
here's what i have now. my plan is to pdb through it and see how the two reconstructed things compare. i feel like they won't compare well. part of this feeling is backed by a personal experience regarding frequency domain mutations and sample-to-sample comparisons. another part is my personal experience of how densely and thoroughly i make mistakes. to really ensure this passes, i should simplify the data so much that it is obvious that it will, verify that it does, make it pass if it doesn't, and then make it more complex and see what stimulates inaccuracy.

import pdb; pdb.set_trace()
period_fft_idx = recording_bufsize / period_length
fft_subregion = np.concatenate(([0], recording_fft[int(period_fft_idx):]))
# then reconstruct using irfft [curious: is this complex output? what do the angles look like?]
# then downsample original_signal to the bandwidth using averaging
# then compare real values
# NOTE: failure is expected AT FIRST. then, identify the mistake, and improve it.
reconstructed_fft = fft_subregion
original_fft = np.fft.rfft(original_signal)
reconstructed_rfftsize = min(len(reconstructed_fft), len(original_fft)) // 2
reconstructed_from_reconstructed = np.fft.irfft(reconstructed_fft[:reconstructed_rfftsize].abs())
reconstructed_from_original = np.fft.irfft(original_fft[:original_fft].abs())
this got me confused. fix is simple, gotta step back and prepare some habits.

Traceback (most recent call last):
  File "/shared/src/scratch/upsample_example_looking_for_test.py", line 33, in <module>
    reconstructed_from_reconstructed = np.fft.irfft(reconstructed_fft[:reconstructed_rfftsize].abs())
AttributeError: 'numpy.ndarray' object has no attribute 'abs'
i am partway through addressing the imperfections. i got the data much closer by prefixing the subrange with the DC offset. obviously, it would never have matched without that. there are further issues that also have that property, that i haven't thought of.

before this change, the data compared like this:
(Pdb) p reconstructed_from_original
array([263.0608397 , 0.95268138, 0.78292761, ..., 1.21166631, 0.78292761, 0.95268138])
(Pdb) p reconstructed_from_reconstructed
array([ 2.53569771e+02, 6.97125806e-02, -2.41580764e-01, ..., -1.75374096e-01, -2.41580764e-01, 6.97125806e-02])

after the change, it now compares like this:
(Pdb) p reconstructed_from_reconstructed
array([245.42651702, 0.86299853, 1.39828105, ..., 0.86886888, 1.39828105, 0.86299853])
(Pdb) p reconstructed_from_original
array([262.83951251, 1.20706425, 0.73003879, ..., 0.84832615, 0.73003879, 1.20706425])

the shapes are roughly the same, and the orders of magnitude of every sample are all the same. the data is still significantly distant, and it's notable that removing the frequency information collected a ton of energy at t=0, removing the original shape of the signal.
the big trigger here for me i guess relates to time spent on the task, and utility of the immediate result - although this would of course be much easier with visualization, a good thing for me to learn here is that i can have success more easily if i start with data so simple, that it actually passes for some other reason. data i understand the structure of well enough to verify what is going on.
oh i can also remove high frequency components from the test signal but i think jumping straight to a square wave would help
here's current file. i think i'd like to change the test data to be a low frequency square wave. then it is easy to debug by inspection; and it may already work fine, the issues stemming only from high-magnitude high-frequency data that is not recorded for long enough.

import numpy as np

signal_bufsize = 1024 * 1024
recording_bufsize = signal_bufsize

original_signal = np.random.random(signal_bufsize)
recording = np.empty(recording_bufsize)
period_length = len(original_signal) // recording_bufsize ** 0.5 * np.random.random()

for out_idx in range(recording_bufsize):  # using a loop because i am confused
    in_idx = int(out_idx * signal_bufsize / period_length) % signal_bufsize
    recording[out_idx] = original_signal[in_idx]

recording_fft = np.fft.rfft(recording)
# now take fft with np.fft.rfft
# then select only upper portion of 1 + bandwidth equal to period_length
# note: am guessing that period_in_recording = recording_bufsize / rfft_idx
# so, solving for rfft_idx
#   rfft_idx = recording_bufsize / period_in_recording
# and, if right, that would be the index in the rfft output of the period's peak
# so further data on it would be to the right, at higher frequencies
import pdb; pdb.set_trace()
period_fft_idx = recording_bufsize / period_length
fft_subregion = np.concatenate((recording_fft[:1], recording_fft[int(period_fft_idx):]))
# then reconstruct using irfft [curious: is this complex output? what do the angles look like?]
# then downsample original_signal to the bandwidth using averaging
# then compare real values
# NOTE: failure is expected AT FIRST. then, identify the mistake, and improve it.
reconstructed_fft = fft_subregion
original_fft = np.fft.rfft(original_signal)
reconstructed_rfftsize = min(len(reconstructed_fft), len(original_fft)) // 2
reconstructed_from_reconstructed = np.fft.irfft(abs(reconstructed_fft[:reconstructed_rfftsize]))
reconstructed_from_original = np.fft.irfft(abs(original_fft[:reconstructed_rfftsize]))
re the [] question in the protodraft, the output is not complex because irfft assumes the missing negative-frequency half is the conjugate mirror of the positive half, so the reconstruction is forced to be real; it would need a differing negative region to come out complex
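a tiny check of that, using only numpy's fft module on made-up data:

import numpy as np

x = np.random.random(8)
full = np.fft.fft(x)
half = np.fft.rfft(x)
print(np.allclose(full[1:4], half[1:4]))                # positive frequencies match the full fft
print(np.allclose(full[5:], half[1:4][::-1].conj()))    # negative frequencies are the conjugate mirror
print(np.fft.irfft(half).dtype)                         # float64: the reconstruction is real by construction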
daydreaming and realising i need to either adjust the indices of the frequencies, or cut at a power of two, so that they still represent the same relative periods
uhhh let’s see. i have 1/0, 1/1, 1/2, 1/3 … and i want to reproduce that for n-x. i was thinking power of 2 won’t work … could i just pad the right end? basically writing it out seems helpful. engaging inhibition :S
then i guess i would frequency shift everything … i wonder what would happen if i did a complex fft and frequency shifted the data so that the signal occupied the entire buffer
of course that didn't work on its own. fun thing to include when relearning these things though. square wave seems to be what makes sense. going a measure farther than simply lowpassing the data and increasing the recording size seems important, since it's so hard to work on things like this.
square wave on cue

hi karl, did you come here from a daydream idea? could you change the data to a square wave so it is easier for me to troubleshoot the algorithm?
#original_signal = np.random.random(signal_bufsize)
original_signal = np.zeros(signal_bufsize, dtype=float)
square_rise_idx = np.random.randint(signal_bufsize)
original_signal[square_rise_idx:] = 1.
ok, so the next step could be to use pdb to look at the fft and ifft of the result. comparing combinations with subregions and shifts will eventually show how to make it work
this will be easier if you can quickly plot them. matplotlib is likely efficient here.
# DISPLAY=:0
import matplotlib.pyplot as plt
plt.subplot(211)  # 2 rows 1 col, plot #1
plt.plot([1,2,3])
plt.plot([2,2,2])
plt.subplot(212)  # 2 rows 1 col, plot #2
plt.plot([6,5,4])
plt.show()
i'm thinking about how to simplify this, when seeing a failure has such huge feedback. a really clear thing would be setting the lower frequency regions of the original fft to 0. then, simply performing the ifft on the result will make something similar to the original signal. data will be missing. this signal can be upsampled by simply processing the sines and cosines at higher density, but a bunch of information will be lost, because it has been turned into moireish stuff at the lower frequencies. ideally, when we took the fft, we would do it in a way that assumed the actual sampling we were performing ...
that is, one wants the entire signal reconstructible, using _only_ the higher frequency sinusoids ...
for the test, then, we could sample the signal at its original definition, in the resampled data
freq[k] = sum_over_n(time[n] * exp(-2 * pi * i * k * n / N))
time[n] = 1/N * sum_over_k(freq[k] * exp(2 * pi * i * k * n / N))
yay sums could be transformed to array operations
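a minimal sketch of those two sums as array operations, on made-up data (this is just the textbook DFT via an outer product, not the draft functions below):

import numpy as np

x = np.random.random(8)
N = len(x)
n = np.arange(N)
k = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(k, n) / N)   # matrix of sinusoids
freq = W @ x                                   # freq[k] = sum_over_n(time[n] * exp(-2 pi i k n / N))
time = (W.conj() @ freq) / N                   # inverse: conjugate the angles, divide by N
print(np.allclose(freq, np.fft.fft(x)))        # True
print(np.allclose(time, x))                    # True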
trying a few things, it seems like changing the sign of the exponent changes the sign of the imaginary component (the sign of the sine)
freq[k] = sum_over_n(time[n] * exp(-2 * pi * i * k * n / N))
time[n] = 1/N * sum_over_k(freq[k] * exp(2 * pi * i * k * n / N))
so, here, the angle is k * n / N, where k is the index in the frequency domain. could i still reconstruct a signal if i scaled k * n / N to be within a small bandwidth ??? i'm guessing that i could, if my data is all within that bandwidth, and sampled it the same way it was generated.
additionally, stating the fft this way means it can be performed in an accumulating manner, without requiring large ram :)
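a rough sketch of what accumulating could look like, for a handful of made-up frequencies: each incoming chunk just adds its contribution to the running bins, so the whole recording never has to sit in ram.

import numpy as np

freqs = np.array([0.01, 0.02, 0.03])                          # cycles per sample, made up for the example
bins = np.zeros(len(freqs), dtype=complex)
n_seen = 0
for chunk in (np.random.random(1024) for _ in range(16)):     # stand-in for streamed audio chunks
    n = np.arange(n_seen, n_seen + len(chunk))
    bins += (chunk * np.exp(-2j * np.pi * np.outer(freqs, n))).sum(axis=1)   # partial freq[k] sums
    n_seen += len(chunk)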
here is my draft. to test it, i should give a full range and verify it's the same as the fft.

def micro_ft(time_data, max_period):
    N = time_data.shape[-1]
    # for the highest frequencies, we can still probably do k * n / N
    # but rather than starting at k = 0, we'll want to start at the minimum, and have
    # fractional k's prior to that.
    # max_period is number of samples
    # k is calculable from it
    min_k = N / max_period
    k = np.linspace(min_k / N, 1, N)
    n = np.arange(N)
    # freq[k] = sum_over_n(time[n] * exp(-2 * pi * i * k * n / N))
    return time_data * np.exp(-2j * np.pi * k * n)

def micro_ift(freq_data, max_period):
    N = freq_data.shape[-1]
    min_k = N / max_period
    k = np.linspace(min_k / N, 1, N)
    n = np.arange(N)
    # time[n] = sum_over_k(freq[k] * exp(2 * pi * i * k * n / N))
    return freq_data * np.exp(2j * np.pi * k * n)
i changed k:
    k = np.concatenate((np.zeros(1), np.linspace(min_k / N, 1, N - 1)))
to include DC offset and added
    import pdb; pdb.set_trace()
    '''defined micro_ft(time_data, max_period) and micro_ift(freq_data, max_period)'''
to help me remember what's up when in pdb
so, the output is not the same as np.fft.fft, but it does retain the data content:

(Pdb) data = np.random.random(8)
(Pdb) p micro_ift(micro_ft(data, 8), 8)
array([0.06095369+0.00000000e+00j, 0.80908567+0.00000000e+00j, 0.63218561-2.77555756e-17j, 0.27129873+0.00000000e+00j, 0.14631817+0.00000000e+00j, 0.30941373+0.00000000e+00j, 0.21181399+0.00000000e+00j, 0.61963621+0.00000000e+00j])
(Pdb) p data
array([0.06095369, 0.80908567, 0.63218561, 0.27129873, 0.14631817, 0.30941373, 0.21181399, 0.61963621])

it reconstructed it exactly. what a relief.
it also reconstructs period data :D

(Pdb) quadhalfdata = np.concatenate([data[:4]] * 4)
(Pdb) p micro_ift(micro_ft(quadhalfdata, 4), 4)
array([0.06095369+0.00000000e+00j, 0.80908567+0.00000000e+00j, 0.63218561+0.00000000e+00j, 0.27129873-1.38777878e-17j, 0.06095369-3.46944695e-18j, 0.80908567+0.00000000e+00j, 0.63218561-5.55111512e-17j, 0.27129873+0.00000000e+00j, 0.06095369+0.00000000e+00j, 0.80908567+0.00000000e+00j, 0.63218561+0.00000000e+00j, 0.27129873+0.00000000e+00j, 0.06095369+0.00000000e+00j, 0.80908567+5.55111512e-17j, 0.63218561+0.00000000e+00j, 0.27129873+0.00000000e+00j])
(Pdb) p quadhalfdata
array([0.06095369, 0.80908567, 0.63218561, 0.27129873, 0.06095369, 0.80908567, 0.63218561, 0.27129873, 0.06095369, 0.80908567, 0.63218561, 0.27129873, 0.06095369, 0.80908567, 0.63218561, 0.27129873])

but of course it does this at the original resolution; i'd need to process the data differently to upsample it
to upsample, I'm trying changing n, the time parameter, to be a linspace instead of an arange: n = np.linspace(0, max_period, N)
the details on sample 0 are all to the right of it whereas the details on the last sample are all to the left
(Pdb) p data
array([0.24436139, 0.0199289 , 0.47162206, 0.0475059 ])
(Pdb) p micro_ift(micro_ft(quaddata, 4), 4, upsample=True)
array([ 0.24436139+0.00000000e+00j,  0.00810581-1.82059549e-02j, -0.44397862-1.59092263e-01j,  0.01057106+4.63148326e-02j,  0.06852563-2.34556449e-01j, -0.00587415+1.90435136e-02j, -0.08421137-4.64042897e-01j,  0.0433988 +1.93223922e-02j, -0.1221807 +2.11623172e-01j, -0.01974854-2.67512115e-03j, -0.32078447-3.45723432e-01j, -0.02497308-4.04123262e-02j, -0.18402159-1.60774822e-01j, -0.01987317+1.48928846e-03j, -0.09805574+4.61315984e-01j,  0.0475059 +4.65541255e-16j])

the first sample and last sample match, but all the middle are interpolated. here, N=16, and max_period=4 . i'd like it if there were 3 interpolated samples after each matching sample. so, I would want a linspace that, instead of (0, 4, 16), is more (0, 5, 17)[:-1]
np.linspace(0,4,16)
array([0.        , 0.26666667, 0.53333333, 0.8       , 1.06666667, 1.33333333, 1.6       , 1.86666667, 2.13333333, 2.4       , 2.66666667, 2.93333333, 3.2       , 3.46666667, 3.73333333, 4.        ])
np.linspace(0,5,17)[:-1]
array([0.    , 0.3125, 0.625 , 0.9375, 1.25  , 1.5625, 1.875 , 2.1875, 2.5   , 2.8125, 3.125 , 3.4375, 3.75  , 4.0625, 4.375 , 4.6875])
interestingly, it's still off by a constant phase, and i'm not sure why yet. it's also interesting that the way the sinusoids sum, its details are equal to its larger shape. i imagine there's a trigonometric identity for that.

(Pdb) data = np.random.random(4)
(Pdb) quaddata = np.concatenate([data]*4)
(Pdb) p data
array([0.16894141, 0.65128647, 0.69280144, 0.71307183])
(Pdb) p abs(micro_ift(micro_ft(quaddata, 4), 4, upsample=True))
array([0.16894141, 0.65128647, 0.69280144, 0.71307183, 0.16894141, 0.65128647, 0.69280144, 0.71307183, 0.16894141, 0.65128647, 0.69280144, 0.71307183, 0.16894141, 0.65128647, 0.69280144, 0.71307183])
(Pdb) p abs(micro_ift(micro_ft(quaddata, 4), 4, upsample=True))[::4]
array([0.16894141, 0.16894141, 0.16894141, 0.16894141])
so i thought a little about the meaning of the linspace, and the location of data in a period, and i realised i needed to just pass a period size one greater to the function, so that the tail of the last sample in the period was processed. it works, when doing that, with the simpler linspace:

(Pdb) data = np.random.random(4)
(Pdb) quaddata = np.concatenate([data]*4)
(Pdb) p data
array([0.40683351, 0.00094595, 0.74521147, 0.1613921 ])
(Pdb) p abs(micro_ift(micro_ft(quaddata, 5), 5, upsample=True))
array([0.40683351, 0.00094595, 0.74521147, 0.1613921 , 0.40683351, 0.00094595, 0.74521147, 0.1613921 , 0.40683351, 0.00094595, 0.74521147, 0.1613921 , 0.40683351, 0.00094595, 0.74521147, 0.1613921 ])
(Pdb) p abs(micro_ift(micro_ft(quaddata, 5), 5, upsample=True))[::5]
array([0.40683351, 0.00094595, 0.74521147, 0.1613921 ])

i guess i'd like to figure out what meaning of N is needed to move that +1 into the body of the function
maybe what's more useful is trying it with sampling error and seeing if it can still recover it i can plan my sampling error so that it should certainly work if done right
a simple sampling error would be a period that is off by a half sample. in that situation, the wave would look like one of twice the period. if the wave is at twice the sampling rate, with data = np.random.random(8), then when recorded at half its resolution it will look like np.concatenate((data[::2],data[1::2])) or such.
the period in the recording will then be half the waveform data length, and we can add 1 or 0.5 to that, one of those may work
i tried with a max_period of 2.5 and it reconstructed the 4-long undersampled data with the two middle samples swapped:

(Pdb) p abs(micro_ift(micro_ft(poorlysampled, 2.5), 2.5, upsample=True))
array([0.40683351, 0.74521147, 0.00094595, 0.1613921 ])
(Pdb) superdata = np.random.random(4)
(Pdb) poorlysampled = np.concatenate((superdata[::2],superdata[1::2]))
(Pdb) p abs(micro_ift(micro_ft(poorlysampled, 2.5), 2.5, upsample=True))
array([0.20217204, 0.32155652, 0.52211907, 0.16352045])
(Pdb) p superdata
array([0.20217204, 0.52211907, 0.32155652, 0.16352045])
in the first pdb line above, i was processing the wrong data by accident, and copied it by accident. i get the same output when i use 3 instead of 2.5. that's interesting !!!
i'm thinking here that when i generate the data, i'm breaking the assumption of value locality in time. i'm passing random values that have no local correlation. then, it's effectively using linear combinations of wavelets equal to the entire signal, and trying to interpolate between these samples -- i'm probably just getting the wrong parts of these wavelets, because the data has no locality. i confused myself a little around this idea
but basically, when i talk about sampling error, and it being off by a half waveform, it's technically not actually the concatenation of the two halves separately sampled, unless it is made of summed square waves, such that slowly shifting keeps the same value. the math assumes it is made of sinusoids. not square waves.
so i guess what i would do is actually sample it as if it is made of sinusoids in order to test for errors in my functions, i would want to do that with a different implementation than the sinusoid summing stuff i already wrote >_>
- i could take the fft of the signal. this turns it into sinusoids.
- then i could phase shift it? adjust all the angles? for each sample?
- and then do an ifft for each sample

that might work. i would shift it so that sample 0 is at the location of each point i am sampling. a slow way to upsample. but another approach.
def phase_shift_complex(data, angle_shift):
    mags, angles = np.abs(data), np.angle(data)
    return mags * np.exp(1j * (angles + angle_shift))
basically i used the same fourier expression, but found it by trial rather than referring to anything, and verified it in pdb:
data
array([0.17457665, 0.27853706, 0.92643594, 0.9938617 ])
[abs(((np.exp(2j * sample_idx * np.pi * np.fft.fftfreq(len(data)))) * np.fft.fft(data)).sum()) / len(data) for sample_idx in [0,1,2,3]]
# sample something made of sinusoids at funny offsets, using a different approach for testing
def sample_sinusoids_funny(data, fractional_indices):
    angle_ratios = fractional_indices / data.shape[-1]
    return [
        ((np.exp(2j * sample_idx * np.pi * np.fft.fftfreq(len(data)))) * np.fft.fft(data)).sum() / len(data)
        for sample_idx in fractional_indices
    ]

first i'll test it in function form, then try to redo the upsampling at half rate
ok uhhhh so if a waveform is N long and i want it twice sampled with the last sample offset by 1/2 in the second sampling, and 1/4 in the first, (or 1 in the second, and 0.5 in the first, in its own index scale)
uhh so into index N-1, goes waveform index N-1
into index N/2, goes waveform index N-2
and the endpoint ... is ... ! at N/2+0.5
so max_period is N/2+0.5, such that N+1 is where the wave restarts twice.
uhhhhhhh then for sampling i have it taking a list of indices, which will be the modulo of a linspace ... the linspace will go ... to ... the inverse of N+1 when the max_period is N/2+0.5
so recording_idx = waveform_idx * (N / 2 + 0.5) / N
then waveform_idx = recording_idx * N / (N / 2 + 0.5)
i think it works
draft saved at 1:15 pm
ok the idxs are wrong
recording idx 0 <==> waveform idx 0
recording idx 1 <==> waveform idx (1.0,3.0)
...
recording idx N <==> waveform idx 4+(2.0,5.0)
the wave is [0,1,2,3]
the recording would be [0,2,1,3]
with indices slowly shifting and modulo so like [0,2,5,7]
this seems to happen with np.linspace(0,N*2-1,N,endpoint=True)
just found by trial
ok, given that linspace, then it looks like
waveform_idx = recording_idx * (N * 2 - 1) / (N - 1)
so recording_idx = waveform_idx * (N - 1) / (N * 2 - 1)
and the max_period is 4 * (4 - 1) / (4 * 2 - 1) = 12 / 7 = N * (N - 1) / (N * 2 - 1)
draft saved at 1:34 pm
--
i tried using the above to construct index offsets and pass them to sample_sinusoids_funny and then micro_*ft, upsample=True . the first and last sample are correct, but the middle two differ. i now have a system that should be closed, where the sinusoids in question are being directly reprocessed. so, i should be able to examine them to find the discontinuity that keeps them from matching. i'm thinking the max_period parameter to micro_*ft could be being interpreted in a way different from how i am using it here. failing that, i can in theory always step through the parts and take notes.
ummmmm so max_period is turned into k in the micro_ functions, and passed to the exponent. looks like it's the angle advancement per sample for each sinusoid, so the frequency is proportional to its inverse.

meanwhile, sample_idcs i believe are frequencies, as a portion of the total data length. np.fft.fftfreq(4) returns [0,0.25,-.5,-.25]
0 is DC
0.25 then would be period = 4
-0.5 is then the nyquist frequency, period=2
so it looks like fftfreq expects frequency to be 1/period
draft saved at 1:53

then if we have k in the exponent as 2j * pi * k * n, where n is the index, what is the period in samples? well, a period goes by when the exponent is 2j * pi. so that's when k * n == 1 . then, the whole thing is N long ... so when k == 1/N, that's an fftfreq .. of ... N? k == 1/N means the period is N. the fftfreq is 1/period, so yes. this means k == fftfreq ... I think?

so a possible difference here is the difference in np.linspace(min_k / N, 1, N - 1) from arange(N) * (N * 2 - 1) / (N - 1)

if i bisect/troubleshoot the issue by simply trying that arange directly, it may still fail for another reason, sadly. the other reason could be missing data in the transform. i know the current transform can reconstruct a wave; i don't know if the other will. maybe i can test it alone.
draft saved a little late at 2:00p

oh i don't have the freqs yet, i have sample indices. i'm using numpy's default frequencies. these default frequencies are periods equal to 2, 4, and infinity.
so what do i have here ... micro_ift and micro_ft do something similar to a fourier transform in a highpass way. they can reconstruct a signal that contains those precise highpass components. i haven't tried or considered what would make all the parts correct, but they're not presently reconstructing a signal that contains imprecise highpass components, including a strong undersampled high frequency. one thing i haven't addressed is whether these strong high frequencies are being interpolated in the same ways between the functions. ideally, the data represents the same underlying thing, and is sampled at the same points, and modeled in the same ways, such that it returns exactly.

in the micro_*ft functions, a sequence of sinusoids are used to model the signal, between a minimum and maximum frequency. the minimum frequency is determined based on a passed maximum period, and this is linearly scaled to something like the maximum frequency. i'm guessing that i can discern the correct scaling to reconstruct the extracted signals, and consider whether or why they will be reconstructed exactly, by thinking about the frequencies used in micro_*ft, and the angles of the resulting sinusoids, at the points in the signal where they are sampled.

each point in the simulated recording has a matching point in the underlying waveform, that i am trying to reconstruct. for each of those points in both the recording, and the waveform, there is a set of sinusoids and angles that are multiplied by the data, to form a new linear combination of its parts. if those sinusoids and angles are the same, the data will be reconstructed.

so there's a shuffling challenge here. 4 angles of 2 sinusoids, taken from 4 frequencies at 4 time points and 4 offsets, need to be the same on both sides. it's 32 total sinusoids: 2 for cos and sin, 4 for the indices, and 4 for the frequencies. cos and sine are just encoding an angle, so it's 16 total angles. 4 indices, and 4 frequencies. each frequency shifts by the index. if we consider the 1/period form of the frequency, then the value that must match is like index * frequency. so 4 indices, each times 4 different frequencies, must match for both the input and the output, for the signal to be reconstructed.

the difference is that when the input is sampled, the indices are sequential. whereas, when the input is reconstructed, the indices (mapped from the recording back to the waveform) are interleaved. the different frequency mappings would kind of need to be able to undo that interleaving using the modulo behavior of sinusoids.
this is a weak point for me because it is analytical rather than rote, so i can engage my triggers much more readily. it uses more storage of imagined concepts and creative considering.
maybe i can write formulas for the two different sides, and their equivalence
indices * frequencies = indices * frequencies
the sampling indices are arange(N)
the recording indices are (arange(N) * (N * 2 - 1) / (N - 1)) % N
since everything is exp(2 pi i * index * freq), indices * frequencies is taken % 1, because sinusoids are % 2pi. only the fractional component of indices * frequencies matters.
...
it might help to think of the overlap at some point. at wavidx == N, the recording indices restart. meanwhile, there are 4 frequencies, that are repeatedly and continuously restarting. at wavidx == N, we might want all those frequencies to align, so that they restart and match the same points. but wavidx == N is never actually sampled, so it could go a different way. one could also add more modelling to the system, and decide the waveform is made of specific underlying sinusoids.
--
maybe it could be helpful to consider sample 1. at sample 1, when sampling, a bunch of frequencies are scaled by 1 and a multiplier. sample 1 breaks into those N frequencies, multiplied by a sampling multiplier. meanwhile, when reconstructing, sample 1 comes from N frequencies, which are all multiplied by index 1 ...?

things actually happen a little more complexly. there are 4 passes of frequencies and indices. first, the fft in the sampling, then, the loop in the sampling, for every sample, then the microft, then the microift . so "sample 1" is repeatedly spread everywhere, and reconsolidated again, or vice versa. the waveform is projected in a modular way through the recording, then back through the microfts.

the sampling and the microft use two different sets of frequencies to transform the data. in the sampling, the waveform is considered a sum of even frequencies. in the microft, the recording is considered a sum of denser frequencies. is it sufficient for these even frequencies to equate to the same portions of the waveform that the denser frequencies do? it seems at least a step. if they are the same frequencies in the waveform, then they simply need to be combined with the same indices in the waveform. so it _should eventually work_. because when sampling the waveform into the recording, these can be the same waveform indices as when reconstructing it.

so how do we have frequencies in the recording, that equate to the same regions of the waveform? it sounds confusing, but it's highpass, so we just need to use the same actual frequencies, and scale them down to the scaling of the waveform . i think!

it's real-valued data, so the negative frequencies in np.fft.fftfreq can be ignored. the micro_*fts , however, may want to process these negative frequencies, a quick approach could be two exp calls
i was working with micro_ft and micro_ift to make their output identical to np.fft.fft when the np.fft.fftfreq frequencies were passed in. i'm thinking now it makes sense to make the highband behavior the same. basically, it's the np.fft.fftfreq frequencies for the small waveform ... i'm on the fence ...
....
1516 i'm having trouble making my inverse real microft function match np.fft.ifft . i have the forward real-valued one making the right output. i haven't yet actually figured out how rfft works, but i know it should be pretty simple. engaging dispersal.
(Pdb) print(inspect.getsource(np.fft.irfft))
@array_function_dispatch(_fft_dispatcher)
def irfft(a, n=None, axis=-1, norm=None):
    a = asarray(a)
    if n is None:
        n = (a.shape[axis] - 1) * 2
    inv_norm = _get_backward_norm(n, norm)
    output = _raw_fft(a, n, axis, True, False, inv_norm)
    return output

-> N is (N - 1) * 2, at least

i guess i'll take it slower. i know i've made ifft work, so making irfft work should be pretty simple from that.
1606 i got irfft to match numpy [i partly cheated and trimmed the number of frequencies to match numpy's, skipping more thoroughly understanding the fourier transform]

now to scale its frequencies to match the waveform's frequencies

freq2period : frequencies are fractions of the total number of samples, so a frequency of 0.5 is a period of N. max_period is already set using the idx_wav2rec function, so it should represent what's right

okay ... um ... so the question is, what is the waveform frequency, in the simulated recording? it has max_period, so the frequency would be max_period / N; its first fourier component would be at 0.5 * max_period / N, right? no, not quite: 0.5 is the nyquist frequency ... not sure i have these frequencies right. 0 is infinite period. 1/N is a period of N. freq = 1/period . so the first non-DC fourier component would be 1/max_period, I think ... [and the phase of this could be used to align with it] looks like that's what it already is

maybe it'll work better at this time. 1613

1620 it doesn't reconstruct the subwave yet, even when i set it to exactly half period. it looks to me like the reason it doesn't reconstruct it would be similar to the reason that the fft/ifft doesn't invert for me, if i add extra frequencies.
why does the fft work? i don't quite remember, not sure if i ever knew. it clearly forms a linear recombination of the parts via sin(). each value is converted to an angle and scaled by many indices and frequencies. then what? why is each operation a scaling by these things?

both the time domain and the frequency domain are scaled by e^(n*k), where n is the index, and k is the frequency. so basically, there's a matrix of sinusoids, that multiplies the data. when multiplied twice a certain way, it undoes itself. each item gets multiplied by every frequency, and distributed. then each of these is again multiplied by every frequency, and distributing it again restores the signal.

what path does a single value go through? it gets rotated to an angle, then summed with a bunch of stuff. then it gets rotated again, and resummed. one view is that the rotations undo each other. this seems reasonably likely. i wonder if i can do an example by hand. i can use numpy primitives to make this more likely to succeed. 1629 this shows the boomerang through the matrix of angles
mat = np.exp(np.outer(np.array([0,1]), 2j * np.pi * np.fft.fftfreq(2)))
mat
array([[ 1.+0.0000000e+00j,  1.-0.0000000e+00j],
       [ 1.+0.0000000e+00j, -1.-1.2246468e-16j]])
v=np.random.random(2)
v
array([0.16463984, 0.36013487])
((v * mat).sum(axis=1) * mat.T).sum(axis=1) /2
array([0.16463984-2.20519010e-17j, 0.36013487+3.40225191e-17j])
the matrix is not complex. it's basically [[1,1],[1,-1]]. it contains only the DC offset and the nyquist frequency. the DC offset turns into an average when 1 is multiplied by all components. the nyquist frequency multiplies -1 by the 2nd sample, and 1 by the first, and sums them.

so how does that undo? there's a sum, and a difference. to restore the values, the difference is added to the sum (via an average) and the difference is subtracted from the sum (via another difference). basically, the values have been turned into differences. maybe i'll add one more frequency 1633
np.fft.fftfreq(3)
array([ 0.        ,  0.33333333, -0.33333333])
mat = np.exp(np.outer(np.array([0,1,2]), 2j * np.pi * np.fft.fftfreq(3)))
mat
array([[ 1. +0.j       ,  1. +0.j       ,  1. -0.j       ],
       [ 1. +0.j       , -0.5+0.8660254j, -0.5-0.8660254j],
       [ 1. +0.j       , -0.5-0.8660254j, -0.5+0.8660254j]])
the first row and column are again a sum, but the other parts are done differently. it's notable that the matrix is symmetrical, and the lower 4 items are complex numbers that all have the same absolute angle (i checked this with np.angle()). draft saved at 436p

they all also have the same magnitude of 1.0. so one could recast the 2-valued example as an angle, rather than a negative value. a negative value has an angle of 180 degrees. when multiplying by it, the item was rotated 180 degrees. summing it with the other item then produced a vector sum. multiplying back, the vector sum or normal sum was rotated 180 degrees, and summed again. walking back along the same old vectors.

this new matrix has angles of 120 degrees (2.094 radians). in each column or row, one of them is 120, and another is -120. i'm wondering if each item is unrotated by its previous angle, when the transposed matrix is multiplied, not sure. 1639

in the time domain 3-valued vector, the first component is at angle 0, the second at angle -159deg, and the third at angle 159deg. i infer the 159 degrees is the angle you get when you go at 120 degrees by one amount, then at -120 degrees by another amount. so it has a sum, and a walk of three steps, one at 0 degrees, one at 120, one at -120. then it has a walk of 0, -120, and 120.

if I use the old mat.T code to invert this 3-valued fftish, i get the second and third values swapped. i can get them in the right order if i negate the ifft and the output. i recall from the formula that it's actually the angles that are negated. is that the complex conjugate? yeah, using mat.conj() instead of mat.T gets the right answer
v
array([0.0335134 , 0.56891046, 0.869467  ])
((v * mat).sum(axis=1) * mat.conj()).sum(axis=1).real / 3
array([0.0335134 , 0.56891046, 0.869467  ])
1645 so, .conj() will negate all the angles. a single value is moved 0 degrees, 120 degrees, and -120 degrees, into frequency space. then it does this again, by the negative angle, to move back into the time domain, i think?

i'm considering 0.56891046. it's multiplied by 1, and by 120 degrees, and by -120 degrees. each element of the output is composed of a different pairing of motions with values. the first output is all the values moving at 0 degrees, summed. the second output is the first value moving at 0, the second value at 120, and the third at -120. so the second value moves at 0 into the first output, at 120 into the second, and at -120 into the third.

when this is undone, the angles are negative. so, it moves at 0 again in the first output, at -120 into the second, and at +120 into the third. each of these motions is added onto the motions of the other values, and it starts at zero. k close here, hard. 1651

when the second value is reconstructed, it's taken from all the other ones. .... and this happens for every row of the matrix, not just one. so, considering just one value again, it goes into every output. then, the second output, is taken from every value.
the first value goes by 1 into the second output
the second value goes by -120 into the second output
and the third value goes by 120 into the second output
or somesuch
so that output by 1 gets returned by 1, using the other side of the matrix
the output by 120 is returned by -120
and the output by -120 is returned by 120
each of these angles restores the complex value to its full value again, and then the result is divided by 3 to reduce the magnitude of having them all there.

so why does or doesn't it work to add a wonky angle in there? 1656
mat = np.exp(np.outer(np.array([0,1,2]), 2j * np.pi * funny_angles))
mat
array([[ 1.000000e+00+0.0000000e+00j,  1.000000e+00+0.0000000e+00j,  1.000000e+00-0.0000000e+00j],
       [ 1.000000e+00+0.0000000e+00j,  6.123234e-17+1.0000000e+00j,  6.123234e-17-1.0000000e+00j],
       [ 1.000000e+00+0.0000000e+00j, -1.000000e+00+1.2246468e-16j, -1.000000e+00-1.2246468e-16j]])
((v * mat).sum(axis=1) * mat.conj()).sum(axis=1).real / 3
array([0.0335134 , 0.39044477, 0.94774717])
v
array([0.0335134 , 0.56891046, 0.869467  ])
np.angle(mat) * 180 / np.pi
array([[   0.,    0.,   -0.],
       [   0.,   90.,  -90.],
       [   0.,  180., -180.]])
these frequencies i made up don't counter each other when turned into angles. the matrix is no longer symmetric. i tried a higher set of frequencies, and the default matrix stays symmetrical. it really seems like i'd need that matrix to be symmetrical, so each of the values can return to where it came from. 1659

i'm noticing that the angles are evenly spaced around a circle. that sounds important. in fact, it's hard to come up with a way to do this evenly spaced around a circle, that is any different at all from what is already there. 1701

in my failing code, i have things that are purportedly evenly spaced around a circle, although i could have made a mistake, but there is a differing number of them than the number of indices.
so what is that like; what am i actually doing, and why doesn't it work? i basically am producing fft output that is much larger than the input. i'm probably multiplying by too few indices, and effectively producing a rectangular matrix that is not symmetrical and doesn't produce the back-migration needed to invert successfully. 1704 if i were to form the entire square matrix, how would i then multiply it by the data? this is a matrix for a 5-long vector:
mat = np.exp(np.outer(np.array([0,1,2,3,4]), 2j * np.pi * np.fft.fftfreq(5)))
np.angle(mat)*180 // np.pi
array([[   0.,    0.,    0.,   -0.,   -0.],
       [   0.,   72.,  144., -144.,  -72.],
       [   0.,  144.,  -73.,   72., -144.],
       [   0., -145.,   71.,  -72.,  144.],
       [   0.,  -73., -145.,  144.,   72.]])
interestingly, the angles actually change if i provide the frequencies in a different order, but they stay balanced with negative signs around diagonal lines
v * mat
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,) (5,5)
well, that's a minor issue. so, if i want to successfully reconstruct things here, i'll need each angular behavior to be countered by an opposing angular behavior. i'd like to include all the frequency data! blrgh 1710
- i have 5 frequencies i want to think about
- there are only 3 inputs
- usually, each input is multiplied by each frequency. that's okay.
- the output is composed of what? each item summed, for each frequency, right?

i played around a little and it actually works fine if a rectangular matrix is used, but a different region of the matrix is selected for the input and the output. and, notably, the denominator is the number of total frequencies, not the number of values.
v
array([0.0335134 , 0.56891046, 0.869467  ])
((v * mat[:,:3]).sum(axis=1) * mat[:3,:].conj()).real.sum(axis=1) / 5
array([0.0335134 , 0.56891046, 0.869467  ])
so, that denominator situation is probably the issue in my code. i can also check the frequency products.
okay it is of course actually a different situation from that. i'm trying to extract only part of the data, which has gotten all mutated. but now i remember, i found the problem was similar to a different problem, of simply having extra frequencies. so this solves the extra frequencies issue, i believe. maybe i should just implement that, and see how it works. 1729

things are complicated a little because i implemented the rfft thing. it makes sense to convert the data to a normal complex fft before processing it, so i can compare with what i just went through. 1752

i started making things the same between what works and what doesn't. a difference i ran into is that the starting angles of the broken code start much larger? i think this basically means it's not evenly spaced around a circle: but it is symmetrical about 0 degrees. i don't know whether that works, i guess i'll look at it. this would be so much easier if i could reasonably imagine these things without my issues. i guess i imagine very slowly now.

looks like my frequency generation is missing frequencies! 1757

1803 well i'm narrowing in on the issue. presently it's generating frequencies starting at a large, rather than small, value, and the resulting matrix is not symmetric. understanding why is again difficult for me. ~ to help me figure this out, i'm trying out generating more frequencies than needed. but it's still only generating frequencies for the passed band, and now that there are more it starts off late.
np.angle(np.exp(np.outer(np.arange(4), 2j * np.pi * np.array([0,.25,-.5,-.25])))) * 180 // np.pi
array([[   0.,    0.,   -0.,   -0.],
       [   0.,   90., -181.,  -91.],
       [   0.,  180.,    0., -181.],
       [   0.,  -91., -180.,   90.]])
np.angle(np.exp(np.outer(np.arange(4), 2j * np.pi * np.array([0,.75,-.5,-.75])))) * 180 // np.pi
array([[   0.,    0.,   -0.,   -0.],
       [   0.,  -91., -181.,   90.],
       [   0.,  179.,    0., -180.],
       [   0.,   89., -180.,  -90.]])
it's making matrices like the second, rather than the first, where one half is off from the other half. it uses this to decide what frequency to start at:
min_k = tN / max_period  # max_period == tN
start_freq = min_k / tN
meanwhile the number of frequencies is fN > tN, so it ends up starting with a short period. i'd like to understand why that makes an issue. 1806. ... and see what options are available for focusing the frequencies on a band. if they have to be that dense across the entire scale, so be it.
when we generate the matrix, it considers all 4 frequencies, and all 4 offsets. the offsets go across an entire period (2pi), so if the frequencies are not arranged right with each other, some of the resulting timepoints are disaligned. 1808 each frequency is scaled by all the offsets, and that's how the matrix is made. do we need all these offsets? the matrix could be symmetric by design. it looks like it just needs a unique set of angles that balance each other. i wonder. i tried with simply symmetrical angles, and it didn't work. i tried with angles that were even multiples but didn't go around a circle, and it didn't work. i'm a little curious as to why this is, but maybe for now it's good enough to know the angles must go around a circle.
i'm looking at N=4. the angles here are -90, 0, 90, and 180. the frequencies are -2/4, -1/4, 0/4, 1/4. so, how do these frequencies make angles that extend evenly around a circle? there are 4 scales of frequencies, and 4 total lengths ... each frequency is multiplied by each offset. so
0 goes by 0, 1, 2, 3 -> all get 0
1/4 goes by 0, 1, 2, 3 -> this extends around a circle
1/2 goes by 0, 1, 2, 3 -> this engages opposite ends of a circle twice
-1/4 goes by 0, 1, 2, 3 -> this extends around a circle in the opposite order
the opposite order produced by -1/4, opposing +1/4, and bounded by 1/2 and 0, makes the symmetry around a circle in every direction in the resulting matrix. the negative and positive frequencies need to be evenly distributed between 0 and 1/2, so that the matrix inverts itself when summed the other way. 1823
so if i want a bandpass, i need to consider the entire matrix, that extends around a whole circle, but i can possibly assume that the parts that are outside the band are not needed. this is something people know well already: you can take the fft, set output portions to 0, and then take the ifft, and you get a band pass (i think! i dunno; quick sketch of that below). so for the start of making it make sense, i would be making a much larger frequency matrix, and basically i propagate all the zeros in properly if i want to make it smaller. maybe! i might be missing something.
so if max_period is 8, and there are 14 frequencies, what does that look like or mean? it's confusing. it's easier if max_period is 7 and there are 14 frequencies. then they can be evenly distributed. it's of course possibly fine to have 16 frequencies. i think i'm missing a couple of understandings here, to figure out how to move forward with possible appropriateness.
- the frequencies work best if they are evenly distributed between -1/2 and 1/2.
- the signal, in theory, contains no information above a set frequency
- the set frequency is not a multiple or factor of the data length
- the data length and the frequency count do not need to be related. the frequencies just need to be balanced.
so, there are a lot of low frequencies that can get included when done normally. I'm thinking a little about low frequencies, and how things walk angles.
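the quick sketch of the fft-zero-ifft band pass mentioned above (my own toy signal; i believe this is the standard trick, though i haven't verified the edge behavior):

import numpy as np

x = np.random.default_rng(1).standard_normal(256)   # toy signal

X = np.fft.fft(x)
f = np.fft.fftfreq(len(x))
band = (np.abs(f) > 0.1) & (np.abs(f) < 0.2)        # keep 0.1-0.2 cycles/sample

X[~band] = 0                                        # zero everything outside the band
x_band = np.fft.ifft(X).real                        # x was real, so keep the real part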
i wonder if the angle walking can sort out with them. if i consider the waveform in the frequency domain, then it's made of a bunch of complex-valued things. when it's viewed in the recording, it's as if it's frequency shifted. all the complex-valued things sit at different frequencies than they did before. their indices in an fft are slid up or down. but they are still at the same angles. when the fft or ifft act, they multiply everything by their angle matrix, rotating all the values and recombining them. the matrix needs to have a positive angle that balances every negative angle, and summing it needs to work a certain way. [it looks like the angles need to form an even circle too, which i haven't understood why.] but it seems like there would be some way to consider this, and account for the presence or lack of information. i think of these things so incredibly slowly ...
so what happens if we just have a bunch of evenly distributed frequencies? how does that work for finding one inside another? we'd want all the frequencies of the waveform, so our N gets squeezed very small. then we want to cover the whole circle of frequencies. originally the distance between frequencies is 1/N, but now we want it to be max_period_ratio / N. so i think it's doable. i'd consider the frequency circle as being scaled from N to N * max_period ... or something like that. i'd break it into enough tiny chunks such that the size is right. then i'd extract only the region of interest, assuming the transform is zero elsewhere.
so what is this N * max_period thing? what should it actually be? the frequencies are distributed between +-1/2 over (N+1). using N there is appropriate if max_period == N. if max_period < N, then what? we possibly need more frequencies. say max_period is N/2: then we imagine N*2 frequencies, instead of N.
new_n = old_n * 2  # max_period = old_n / 2
it looks like new_n = old_n / (max_period / old_n) = old_n ** 2 / max_period
say max_period is old_N/3, then we imagine N*3 frequencies:
new_N = old_N**2 / max_period = old_N**2 / (old_N / 3) = old_N * 3
ok, i have fN = tN * tN / max_period (tiny numeric check below). what happens if max_period is very near tN? then
fN = tN * tN / (tN * 0.99)
fN = tN / 0.99
it just scales up fN a little bit. are the periods the right length, in the imagined extended time sequence? if max_period = tN * 0.99, then the space between samples to recover each waveform sample is ... what ... uh ... there are N samples within max_period, and the data is N samples long. max_period / N is the amount of space, in recording samples, we want between time points. it's <1, as it should be. so that's tN * 0.99 / tN = 0.99, and then fN = tN / 0.99, so fN * 0.99 = tN. it works out. the right distance is between the samples, because there are tN samples in a waveform length of data. ok, so basically i generate the same frequencies, but i generate many more, according to, um ... fN = tN * tN / max_period . 1842
1853 it's notable that filtering in the frequency domain does not equate to leaving frequencies out, at least because they still contribute to the sum that things are scaled by in the inverse. 1858 i found an inaccuracy in the approach. with an arbitrary scale, the number of frequencies can become noninteger. this may mean my test data will fail unless i ensure it is an integer. 1859 it's processing the basic normal fft data accurately now, which it wasn't succeeding at before. i'm not filtering the fft, just using the whole thing. 1901 it's still not reconstructing the smaller wave right.
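the tiny numeric check of that bookkeeping, fN = tN * tN / max_period (my own variable names, plain python):

tN = 8
for max_period in (8, 4, 2, 8 * 0.99):
    fN = tN * tN / max_period        # number of frequencies to imagine
    spacing = max_period / tN        # recording samples between recovered time points
    print(max_period, fN, fN * spacing)   # last column returns to tN every time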
i think the loop for checking is much tighter now, though. the points at which the wave is sampled should be the same distance apart. 1904 looks like it's still generating freqs in a circle when a max_period is specified. 1908 the reconstructed waveform has samples equal in value to ones from the original waveform again. it's still set to the simple N/2 period. when i set the output size to the length retained in that waveform (N/2), it outputs it exactly correctly. so that's progress. i'm not expecting as much from the variant that is offset by half a sample.
ok, max_period = 3.73333 and tN = 8. the frequencies aren't exact; tN * tN / max_period is nonintegral. the max_period is ideally a factor of the square of the input length. is that doable? 3.73333 = 8 * (8 - 1) / (8 * 2 - 1): 8*7/15. uhhh i could just scale up the input length so there's a 7 factor in there. wonder if it gets too big. oh, the 7 is a function of the input length ;p
1917 i doubled the recording length, which happened to meet the factor-of-square constraint. i forgot to increase the output size, so it was still squeezing it into half length. the "reconstructed" output was exactly equivalent to the first half of the wave. the second half is nowhere to be seen. maybe an artefact of the wavelets-of-self situation. also an artefact of making max_period this value ... it ends up not scaling the waveform at all! it's pasted right in :S 1922 i tried out removing the assertion that the period be a fraction of the total length, and when it isn't it's doing the same old thing of having the first and last sample match but not the ones in the middle. if the period is a fraction of the total length, how do we use moireness to get other parts of the signal? it will just divide evenly! somehow it actually recovers the signal under those conditions, which is strange. oh, this makes sense because it is not downsampled in the condition i tested. hum .........
1927 this part seems difficult.
- it samples the signal evenly in the time scale of the fake recording
- it does this via a fourier transform, so atm it is setting a frequency for every combination of waveform offset and recording offset, but a filter can seriously reduce those. it eases confused analysis.
- there must be some situation where the signal slides along under the recording, and its parts can be picked up. it could slide along in some kind of even multiple
- this kind of could mean a lot of subsamples. for example if the waveform is wN long and the recording is cN = wN * 2 + 1 or cN = wN * 2 - 1, then there is some kind of disagreement between the two.
say the waveform and recording are both N long, but the max period is 1/(N+1) or 1/(N-1), whatever is needed so that a single sample is made of every period, and each one is offset in the wave by 1/N. so the subwave samples in the larger sequence have a frequency of like 1/N^2 or something? max_period = 1/N .. or off a little bit. ok so if on the 2nd sample, i want the 2nd sample of the 2nd waveform to pop through, then i would need to advance along the waveform by N + 1. so 1 unit in the larger recording is N + 1 for the waveform. then the period size is .... i'm not sure. a period is N. 1 is N + 1. so period length is 1 / (N + 1) * N = N / (N + 1). 1934 ok, if i am starting with period length, then i'll need to form the sampling indices based on that. maybe i can just calculate them. ha, this is ridiculous: i directly sample the wave onto the recording. it looks identical. (tiny sketch of that below.)
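the "ridiculous" direct-sampling observation above, as a tiny integer-index sketch (my own toy data; this is just the modulo arithmetic, no fourier involved):

import numpy as np

N = 8
waveform = np.random.default_rng(2).random(N)

# each recording sample advances N+1 positions along the repeating waveform,
# which modulo N is exactly one waveform sample per recording sample --
# so the recording is just the waveform again
recording = waveform[(np.arange(N) * (N + 1)) % N]

print(np.allclose(recording, waveform))   # True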
the system does figure out the max_period of 0.8888888 and produce the right output, though. ok, so now something that makes the data get shuffled around. some amount that it is advanced every waveform. can i maybe make it shorter ... N / (N - 1)? so it progresses backward? this makes sense to try because it means making that tiny period larger. now max_period is 1.142857142857. unfortunately, the output is wrong. i'm thinking when i sample it i need to use the same frequency count, or it's a different wave being interpolated from and sampled from. 1939 no, that doesn't seem to be it. blarg dangit. i should probably low-pass the waveform and see if it comes out similar to what it was before. okay, it's still not looking good with lowpassing, and that makes sense, because i still have all the low frequencies in there and it's only sampling a single period, so it gets all the other artefacts overlayed. 2008 ha, i'm actually ignoring the upsample flag entirely now. the code was short-circuited when i switched to the fN approach. my goodness. 2014 sadly it's not working. a gain is that i built further thoughts around the FFT. it's quite hard to engage my cognitive issues while experiencing strong pressure to stop working on something. i think i can improve at this.
notes after a little sleep:
- you can see that this should work by imagining the waveform defined in analytical frequency space, as a sum of sinusoids at precise frequencies. [makes a clear and accurate fourier transform possible via many different methods]. removes aliasing artefacts. [noise can then be added to the recording, and lengthening it shown to improve the snr]
- can also see it should work given we have shown denser frequencies than needed work. we would define the unit circle as the ceiling of the max period. if the data length is cropped to be an even multiple of this, then it can be extracted normally.
- using random data to test with, without defining its analytical form, complicates the situation, adding unnecessary extra challenges.
- having the recording be an even fraction of the waveform does not mean that the subsample data is lost: because this does not imply it is a whole number of samples. this is done by setting the recording length to a wonky value, and the max_period to an even fraction of that value. for example, if recording_N is 99, and max_period is 9.9, then it fits 10 times and offsets its phase by 10% each instance. this can also work with recording_N 990 and max_period 99. (quick sketch after these notes.)
- one can also see it should be workable by considering those long, stretched instances, where max_period > waveform_N * 2. considering this builds reason to test with an analytic shape such as a sine wave or a square wave. it shows how the interpolation plays out.
- my sense, from past experience, is that if i want this to work right, i'll want to crop the recording length to be an even multiple of the max_period. this works if the max_period is an integer.
having some difficulty simplifying the problem space to use an integral max period and an analytically-defined waveform. i don't seem to choose to do these things. still, it's nice to list out two or three ways to show that it should actually work, and that these ways have clear implementation paths. struggling with this toward a result helps engage my algorithmic inhibitions. most of my success yesterday was in manually making fourier and inverse fourier transforms, notably with denser frequencies than the data required. a next step could be accumulating a fourier transform over extended data -- a high pass, but such that there is still a complete unit circle. basically, the max_period would be selected so as to be a fraction of the whole thing already. the reason the high pass max_period = (tN / (tN // max_period)) works is because it is equivalent to summing individual max_period fourier transforms (where max_period is an integer, or recording_length is cropped), each one offset in phase by a constant, and each of those individual transforms would show an individual downsampled copy of the waveform. this is only the same as analogous successes if max_period is an integer fraction of the recording length.
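the quick sketch of that recording_N 99 / max_period 9.9 bullet (numpy only; just showing the phase walking by 10% per instance):

import numpy as np

recording_N = 99
max_period = 9.9                                  # fits exactly 10 times into 99

starts = np.arange(0, recording_N, max_period)    # start position of each instance
print(len(starts))                                # 10 instances
print(starts % 1)                                 # fractional offsets: 0.0, 0.9, 0.8, ...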
it's seemed good for me to be able to spend so much time pursuing this, despite the difficulty. i'm just poking at the details of what a dft is some more. i found that this expression of moving v into frequency space and back into time space:
((v * mat).sum(axis = 1) * mat.T).sum(axis=1) / 2
can be expressed more simply with basic matrix multiplication:
v @ mat @ mat.T / 2
much more concise! 0500 so of course the above v @ mat @ mat.T / 2 implies the matrix's transpose is its inverse, up to the factor of 2. it looks like the inverse is the transpose over 2:
mat.T
array([[ 1.+0.0000000e+00j,  1.+0.0000000e+00j],
       [ 1.-0.0000000e+00j, -1.-1.2246468e-16j]])
np.linalg.inv(mat)
array([[ 0.5+3.061617e-17j,  0.5-3.061617e-17j],
       [ 0.5-3.061617e-17j, -0.5+3.061617e-17j]])
np.linalg.norm(mat)
2.0
The convention, with fft, is to divide by the frequency count after returning to the time domain, but it looks like things could be optimized if the frequency data were already treated this way:
v
array([0.71733727, 0.82679766])
v @ (mat/2**.5) @ (mat.T/2**.5)
array([0.71733727-5.06267555e-17j, 0.82679766+5.73292713e-17j])
v @ (mat/2**.5) @ np.linalg.inv((mat/2**.5))
array([0.71733727+0.j, 0.82679766+0.j])
that is, if the frequency product matrix is divided by the square root of the frequency count, then the discrete fourier transform is simply a matrix multiplication. 0505 i'm thinking of trying going through by hand a very simple thing, to try to look at the details of angle selection better.
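a sketch of that as a general statement (my own check; note that for larger N the plain transpose is no longer enough on its own, since the matrix stops being real -- the conjugate transpose is what inverts it):

import numpy as np

N = 6
F = np.exp(np.outer(np.arange(N), 2j * np.pi * np.fft.fftfreq(N))) / np.sqrt(N)

# with the 1/sqrt(N) scaling the matrix is unitary: conjugate transpose == inverse
print(np.allclose(F @ F.conj().T, np.eye(N)))     # True

v = np.random.default_rng(3).random(N)
print(np.allclose(v @ F @ F.conj().T, v))         # round trip with no extra division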
websearch result: "If inverse of the matrix is equal to its transpose, then it is an orthogonal matrix" 0508 https://en.wikipedia.org/wiki/Orthogonal_matrix
In linear algebra, an orthogonal matrix, or orthonormal matrix, is a real square matrix whose columns and rows are orthonormal vectors.
https://en.wikipedia.org/wiki/Orthogonal_matrix#Properties
A real square matrix is orthogonal if and only if its columns form an orthonormal basis of the Euclidean space Rn with the ordinary Euclidean dot product, which is the case if and only if its rows form an orthonormal basis of Rn.
I infer the rows and columns must all be at 90 degree angles to each other. In the complex domain, where each value is itself a 2-vector, that's a little confusing. I'm considering thinking of these orthonormal vectors as being planes, which are perpendicular to each other in 4-dimensional space, but maybe it will be clearer if I look straight at the numbers, maybe later. Anyway, for transforms that round-trip back to the original data, it makes sense the vectors are orthogonal and define a coordinate space. The article says that all orthonormal matrices can be shuffled so as to be a diagonal of 2x2 rotation matrices, with an extra 1 on the diagonal if the row and column count is odd. https://en.wikipedia.org/wiki/Orthonormality
two vectors in an inner product space are orthonormal if they are orthogonal (or perpendicular along a line) unit vectors.
https://en.wikipedia.org/wiki/Orthogonal_matrix#Overview
Orthogonal matrices preserve the dot product,[1] so, for vectors u and v in an n-dimensional real Euclidean space u dot v = (Qu) dot (Qv) where Q is an orthogonal matrix.
This implies the dot product of values in frequency and time space are off only by a constant factor.
np.dot(v,v)
1.198167136558938
np.dot(v@mat/2**.5, v@mat/2**.5)
(1.1981671365589377+1.1083248738116542e-17j)
The angle between values is the same. This makes sense, since everything is simply in a different coordinate basis. 0524 https://math.stackexchange.com/a/835837
Every real Householder reflection matrix is a symmetric orthogonal matrix, but its entries can be quite arbitrary.
In general, if $A$ is symmetric, it is orthogonally diagonalisable and all its eigenvalues are real. If it is also orthogonal, its eigenvalues must be 1 or -1. It follows that every symmetric orthogonal matrix is of the form $QDQ^\top$, where $Q$ is a real orthogonal matrix and $D$ is a diagonal matrix whose diagonal entries are 1 or -1.
In more geometric terms, such a matrix is an orthogonal reflection through a subspace. I.e., if $A$ is symmetric and orthogonal, then $P:=\frac12(A+I)$ is an orthogonal projection, and $A=2P-I$ is the reflection through the image of $P$.
not immediately sure what all that math syntax represents. browser renders it if i turn on javascript from cloudflare.com or such.
if A is symmetric and orthogonal, then P := 1/2 (A+I) is an orthogonal projection, and A = 2P−I is the reflection through the image of P.
You can construct orthogonal and symmetric matrices using a nice parametrization from Sanyal [1] and Mortari [2].
One can construct such a matrix with a choice of $n$ orthogonal vectors $\{r_k\}_{k=1}^{n}$ and the desired number of positive eigenvalues $p \in [0, n]$: $R = \sum_{k=1}^{p} r_k r_k^\top - \sum_{k=p+1}^{n} r_k r_k^\top$. They also point out that
if $p = n$, then $R = I$, whereas if $p = 0$, then $R = -I$.
I think it's stated a couple different ways that a symmetric orthonormal matrix is a statement of N orthogonal vectors and a selection of p which are positive, the rest of which are negative. I don't know why positive/negative is separated out, instead of just saying it is N signed orthogonal vectors. I tried this with my "v" vector and a perpendicular vector (v[1],-v[0]), but i may not have been quite sure what i was doing. https://en.wikipedia.org/wiki/Householder_transformation
a Householder transformation (also known as a Householder reflection or elementary reflector) is a linear transformation that describes a reflection about a plane or hyperplane containing the origin.
householder matrices are defined in terms of the outer product of a normal vector with its conjugate (doubled and subtracted from the identity matrix). there is a lot that can be learned here! 0546 I want to think more about the construction of the matrix from time points along sinusoids.
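before moving on, a minimal sketch of that householder construction (my own vector, real case so the conjugate is a no-op; just checking it comes out symmetric and orthogonal like the answer says):

import numpy as np

u = np.array([0.71733727, 0.82679766])      # reusing my v-like vector as the normal
u = u / np.linalg.norm(u)                   # unit normal

H = np.eye(2) - 2 * np.outer(u, u)          # identity minus doubled outer product

print(np.allclose(H, H.T))                  # symmetric
print(np.allclose(H @ H.T, np.eye(2)))      # orthogonal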
np.fft.fftfreq(2)
array([ 0. , -0.5])
freqs = np.fft.fftfreq(2)
mat = np.exp(np.outer(np.array([0,1]), 2j * np.pi * freqs))
mat
array([[ 1.+0.0000000e+00j,  1.-0.0000000e+00j],
       [ 1.+0.0000000e+00j, -1.-1.2246468e-16j]])
Each complex sinusoid is a spiral in the complex plane over time, from 0 to ....? The first row and first column evaluate this spiral at time point 0. The second row and second column evaluate this spiral at time point 1. There's only one spiral in the above example, since one of the frequencies is 0. Spiral defined by sin_or_cos(t * -0.5 * 2pi) ... so when t is +-2, it has cycled. The spiral goes from 0 to -2. evaluating it at 1.0 evaluates its negative 180 degree point, which should be -1.
mat.real
array([[ 1.,  1.],
       [ 1., -1.]])
right, okay i guess. regarding the columns and rows, one of them represents moving along the selected time points (0.0, 1.0); the other represents moving along the selected frequencies (0.0, -0.5) . One could normalise these to make them more equivalent:
mat = np.exp(np.outer(np.array([0,1])/2**.5, 2j * np.pi * freqs*2**.5))
mat
array([[ 1.+0.0000000e+00j,  1.+0.0000000e+00j],
       [ 1.+0.0000000e+00j, -1.-5.6655389e-16j]])
np.array([0,1])/2**.5
array([0.        , 0.70710678])
np.fft.fftfreq(2)*2**.5
array([ 0.        , -0.70710678])
I don't know what that would mean at higher N's, though. just something notable here. Thinking of how the length of these vectors is all sqrt(N). back to [0,1] and [0,-.5] . Two sinusoid spirals. A matrix made from them. The matrix contains orthonormal vectors. When I was trying funny offsets, I was probably constructing a matrix that did not contain orthonormal vectors.
np.exp(np.outer(np.array([0.125,1.25]), 2j * np.pi * freqs))
array([[ 1.        +0.j        ,  0.92387953-0.38268343j],
       [ 1.        +0.j        , -0.70710678+0.70710678j]])
Here I evaluate the spirals at 0.125 and 1.25 . I'm imagining that the data is viewed stretched and offset by 1/8th of a sample. Maybe keeping it pinned to zero would be more clear:
np.exp(np.outer(np.array([0,1.125]), 2j * np.pi * freqs))
array([[ 1.        +0.j        ,  1.        -0.j        ],
       [ 1.        +0.j        , -0.92387953+0.38268343j]])
v @ mat @ mat.T / 2
array([0.74880538+0.15820088j, 0.73301797-0.15506057j])
v
array([0.71733727, 0.82679766])
The matrix's transpose is no longer its inverse. It doesn't contain orthonormal vectors.
np.dot(mat[0], mat[1])
(0.07612046748871315+0.38268343236508967j)
abs(np.dot(mat[0], mat[1])) * 180 / np.pi
22.355704150744625
Maybe they're offset by 22 degrees; I'm not sure what's meant by a complex angle. Normally, the dot product is 0:
mat=np.exp(np.outer(np.array([0,1]), 2j * np.pi * freqs))
np.dot(mat[0], mat[1])
-1.2246467991473532e-16j
Something that's notable about a matrix that isn't orthonormal and symmetric is that it still has an inverse.
v @ mat @ np.linalg.inv(mat)
array([0.71733727+0.j, 0.82679766+0.j])
v
array([0.71733727, 0.82679766])
So, the resulting "spectrum" can still technically be used to recover the original data, by precisely undoing the wonky transformation it performed. 0602 I'm thinking of the meaning of applying the wonky transformation matrix from the weird points of frequency evaluation, and I'm guessing that it might still be a spectrum that is output. Not certain ... I'll try with very clear data.
def signal(x):
...     return np.exp(2j * np.pi * x * 0.5)
...
signal(0)
(1+0j)
signal(1)
(-1+1.2246467991473532e-16j)
signal(0.5)
(6.123233995736766e-17+1j)
signal() is now a sinusoid precisely aligned with the fourier algorithm. 0605
In the fourier transform, the theorised signals are sampled at their peaks. The def signal(x) function above provides a signal that is +1 at the 0 sample, and -1 at the 1 sample. Multiplying this by the fourier signal gives +1 and +1 at each sample:
signal(np.array([0,1])) * np.exp(np.array([0,1]) * 2j * np.pi * freqs[1])
array([1.+0.j, 1.+0.j])
If the signal is offset, then these products give the phase of the offset:
signal(np.array([0.1,1.1])) * np.exp(np.array([0,1]) * 2j * np.pi * freqs[1])
array([0.95105652+0.30901699j, 0.95105652+0.30901699j])
abs(0.95105652+0.30901699j), np.angle(0.95105652+0.30901699j) * 180 // np.pi
(1.0000000021715851, 17.0)
Offsetting by 0.1 samples here makes a phase offset of roughly 17 degrees (really 18 degrees, since 0.1 samples of a 180 degree/sample signal is 18 degrees; the floor division rounds it down). The output has a magnitude of 1.0 and a constant phase, at both samples.
- So, if we have a scaled wave that is not aligned with the samples, and we want to extract it precisely, we're going to be sampling it at points other than its peaks. It will be sampled at a different point in its wave each time.
-> A transform that undoes the results of that product at non-peaks, and outputs a vector of units with constant angles, seems like it would be very useful there. It seems likely to me such a transform could recover the data precisely. <-
[some typing lost]
signal(np.array([0,1,2,3])*1.1+1.1) * np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[2])
array([-0.95105652-0.30901699j, -0.95105652-0.30901699j,
       -0.95105652-0.30901699j, -0.95105652-0.30901699j])
There it works with 4 samples: a single phase and magnitude of 1.0 are reconstructed from data with 10% sampling error. The input data is not phase aligned:
signal(np.array([0,1,2,3])*1.1)
array([ 1.        +0.j        , -0.95105652-0.30901699j,
        0.80901699+0.58778525j, -0.58778525-0.80901699j])
So it is looking much more doable !! The next wave has half the frequency. How does this work out? How does this combine to lose or recover information? How does it introduce noise? For the test code, what is most relevant is, how do I retain the meaning of the interim data as being a spectrum of the input, and precisely recover the input by processing that spectrum.
signal(np.array([0,1,2,3])*1.1+1.1) * np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])
array([-0.95105652-0.30901699j, -0.70710678+0.70710678j,
        0.30901699+0.95105652j,  0.98768834+0.15643447j])
np.fft.fftfreq(4)
array([ 0.  ,  0.25, -0.5 , -0.25])
wonky_points = signal(np.array([0,1,2,3])*1.1+1.1) * np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])
abs(wonky_points), np.angle(wonky_points)*180//np.pi
(array([1., 1., 1., 1.]), array([-162.,  135.,   72.,    9.]))
-162+360-135
63
135-72
63
72-9
63
The output is of unit length, and there are 4 angles offset by 63 degrees. The output with constructive interference, where the frequency was 0.5, all had the same angle, so summing them makes a peak. This output, here, is supposed to have destructive interference. If 63 were 90, then the outputs would all cancel each other and the sum would be 0. Instead, it's 63, so there will be some sum.
accurate_points = signal(np.array([0,1,2,3])+1.1) * np.exp(np.array([0,1,2,3]) * 2j * np.pi * np.fft.fftfreq(4)[1])
abs(accurate_points), np.angle(accurate_points)*180//np.pi
(array([1., 1., 1., 1.]), array([-162.,  107.,   17.,  -73.]))
-162+360 - 107
91
107 - 17
90
17 - -73
90
-73 - -162
89
The original data indeed has 90 degree offsets and completely eradicates itself. Additionally, with 4 samples and 90 degree offsets, the offsets complete a circle and cycle within the data. This is basically the equivalent of destructive interference, since walking vectors in a circle returns to zero. So, rather than stabilising the constructive interference, which is already stable, it seems more interesting to consider the destructive interference with the wave twice as large. With the twice as large wave, the algorithm is relying on the second half of the data canceling out the first. The sinusoid of the longer wave is opposite here, whereas the sinusoid of the shorter wave is equivalent. Because the sampling points differ, the data does not cancel. Here seems a more precise place where it could be transformed, so as to cancel itself out. What is going on that needs transformation? We're multiplying two sinusoids at specific points. One of the sinusoids has half the frequency. I'm engaging a strong internal struggle trying to consider this. I'm guessing part of my mind found that part of this is unlikely to work how I hope, and is pressing back, but I don't know yet why that would be, since I'm struggling so hard to hold it. I'm trying to discern what kind of situation is going on when a wave with half the frequency is compared, at samples that do not align with the zero crossings of the waves, to consider whether the data can be transformed to reproduce the destructive interference of zero crossings.
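a compact check of that interference difference (self-contained restatement of accurate_points and wonky_points from above, my own shorter names):

import numpy as np

def signal(x):
    return np.exp(2j * np.pi * x * 0.5)

idx = np.arange(4)
freqs = np.fft.fftfreq(4)

accurate = signal(idx + 1.1) * np.exp(idx * 2j * np.pi * freqs[1])
wonky = signal(idx * 1.1 + 1.1) * np.exp(idx * 1.1 * 2j * np.pi * freqs[1])

print(abs(accurate.sum()))   # ~0: the 90-degree steps close the circle and cancel
print(abs(wonky.sum()))      # clearly nonzero: the 63-degree steps don't close it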
wonky_points = signal(np.array([0,1,2,3])*1.1+1.1) * np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])
what's going on here is the product of two complex sinusoids. (i also made a zero-based variant, zero_based_wonky_points, which is the same product without the +1.1 phase offset; its definition is pasted a little further down.)
abs(zero_based_wonky_points), np.angle(zero_based_wonky_points)*180//np.pi
(array([1., 1., 1., 1.]), array([   0.,  -63., -126.,  170.]))
Just looking at it, it seems like a solution would be to scale the angles. The internet confirms that the product of two complex numbers has the sum of their angles. I'm immediately having difficulty confirming that:
np.angle(np.exp(np.array([30, 40]) / 360 * 2j * np.pi)) * 180 // np.pi
array([29., 40.])
np.product(np.angle(np.exp(np.array([30, 40]) / 360 * 2j * np.pi))) * 180 // np.pi
20.0
(0.8660254 +0.5j) * (0.76604444+0.64278761j)
(0.342020137568776+0.939692617065294j)
np.angle((0.8660254 +0.5j) * (0.76604444+0.64278761j)) * 180 // np.pi
70.0
It looks like that is true, and np.product simply doesn't do what I wanted it to. Ok, so if the angles of the products of these sinusoids are increasing by a constant, this means the sums of the angles of the parts are increasing by a constant. Multiplying the signal sinusoid, by the fourier sinusoid, adds their angles together. I'll look at this again:
zero_based_wonky_points = signal(np.array([0,1,2,3])*1.1) * np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])
abs(zero_based_wonky_points), np.angle(zero_based_wonky_points)*180//np.pi
(array([1., 1., 1., 1.]), array([   0.,  -63., -126.,  170.]))
np.angle(signal(np.array([0,1,2,3])*1.1)) * 180 // np.pi
array([   0., -162.,   36., -127.])
np.angle(np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])) * 180 // np.pi
array([   0.,   99., -162.,  -64.])
The fourier signal is advancing at half the rate of the measured signal. 0652 When the measured signal is at -162, the fourier signal is at 99. When the measured signal is at 36, the fourier signal is at -162.
-162 - 0
-162
36 - (-162+360)
-162
-127 - 36
-163
Because the sampling is offset, the measured signal is advancing by -162 degrees every sample. Meanwhile, the fourier signal is advancing by 99 degrees:
99 - 0
99
-162 - (99 - 360)
99
-64 - -162
98
I'm missing something: why does the angular rate of the fourier signal not appear to be half the angular rate of the measured signal? Some aliasing effect? Oh, actually it is:
99 * 2
198
-162 + 360
198
So the faster signal is advancing by 198 degrees/sample, and the slower signal is advancing by 99 degrees/sample. When we multiply them, we get a sequence of unit vectors that advance by -63 degrees/sample. This is the same as +297 degrees, which is 198 degrees + 99 degrees:
-63+360
297
99+198
297
When the signals align, the faster frequency advances by 180 degrees, and the sum advances by 0 degrees. The slower frequency advances by 90 degrees, and the sum advances by 270 degrees, which results in it hitting -90, 180, and 90 with modulo 360. If the samples are aligned, the sum advances by 270 degrees. Here, in the disalignment, the sum instead advances by 297 degrees. Performing a transform on this row that converts the 297 degree advancements back into 270 degree advancements could help. Further consideration of combinations of these could help figure out a formula for that transform. But is it even reasonable to scale a sequence of complex numbers such that they express a rotation at a different rate? I'm thinking that could be done with a further complex multiplication. 0704 If these all advance at 297 degrees, then maybe I could divide them by appropriate exponentiated complex numbers to make them align to 0 degrees, then multiply them again to bring them into 270 degrees. Not sure. I'm thinking of how to design an exponent that increases by 297 degrees each time. I know I just wrote this to display it, but I'm still confused around it. If I want the complex number in the base of the exponent, then the number I would use would be the one at 297 degrees.
np.angle(np.exp(297 * np.pi / -180j) ** np.array([0,1,2,3])) * 180 // np.pi
array([   0.,  -64., -127.,  170.])
I'm using the power operator to sort my thoughts out, but of course it would be more efficient to simply multiply the angle before exponentiating with the exp function. Another question is, where does this angle 297 come from? This must relate to the frequency difference of the original data. That puts all the parts together. There was something I wanted to consider to ensure the approach had a reasonable likelihood of being valid.
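where the 297 comes from, in terms of the pieces already measured (just the arithmetic, using the rates from above):

signal_rate = 180      # deg/sample of the measured signal
fourier_rate = 90      # deg/sample of the fftfreq(4)[1] component
stretch = 1.1          # the 10% sampling error

print((signal_rate + fourier_rate) * stretch)   # 297.0 -- the wonky advancement
print(signal_rate + fourier_rate)               # 270   -- the aligned advancement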
zero_based_wonky_points = signal(np.array([0,1,2,3])*1.1) * np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])
abs(zero_based_wonky_points), np.angle(zero_based_wonky_points)*180//np.pi
(array([1., 1., 1., 1.]), array([   0.,  -63., -126.,  170.]))
np.angle(signal(np.array([0,1,2,3])*1.1)) * 180 // np.pi
array([   0., -162.,   36., -127.])
np.angle(np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])) * 180 // np.pi
array([   0.,   99., -162.,  -64.])
It looks like it's quite possible to adjust the phase offsets of the results of multiplying a unit signal by a unit fourier component so as to recover destructive interference. In reality there are multiple signals at once, multiple components at once, and notably the signal does not have unit magnitude. A first test could be to give the signal a different magnitude, and see what the result looks like.
zero_based_wonky_points = signal(np.array([0,1,2,3])*1.1)*1.1 * np.exp(np.array([0,1,2,3]) * 1.1 * 2j * np.pi * np.fft.fftfreq(4)[1])
abs(zero_based_wonky_points), np.angle(zero_based_wonky_points) * 180 // np.pi
(array([1.1, 1.1, 1.1, 1.1]), array([   0.,  -63., -126.,  170.]))
It looks like magnitude is retained intact, so that much is great. Then, real data would have more than one signal going on at once, and more than one fourier component going on at once. 0712 What does that mean? Well, for one thing I could try a further combination. I can imagine in my mind that changing the fourier frequency here, or the signal frequency, will just change the angle advancement, and can be similarly countered. What if there are multiple signals? Here, there is just a DC component and a single signal. It seems it could be reasonable to look briefly at 6-valued data, so as to consider the interference of 2 disaligned signals at once. It seems like that might require a more complex transform. The fftfreqs of 6-valued data have 3 different absolute frequencies in them, in addition to the DC offset. I imagine the negative frequencies have an arithmetic equivalence, but it's a little more confusing for me to consider what that is. 0714
goal: construct two different signals, that have frequencies aligned with the different frequencies in fftfreq(6) . then consider the product of multiplying these by a further third frequency, as if in a fourier transform, with 10% or such difference between the sampling rate and the signals. what is the resulting product? i infer it would be helpful to consider each signal separately, so as to visually see the formula for this sum and product. 0721 This was the previous signal function with a frequency of 0.5:
def signal(x):
...     return np.exp(2j * np.pi * x * 0.5)
...
Here are the frequencies for 6-valued data:
np.fft.fftfreq(6)
array([ 0.        ,  0.16666667,  0.33333333, -0.5       , -0.33333333, -0.16666667])
Here they are as degrees:
np.fft.fftfreq(6)*2*np.pi*180//np.pi
array([   0.,   59.,  119., -181., -120.,  -60.])
So, 0.5 is 180 degrees, and the two new frequencies are 60 and 120 degrees. I guess degrees here means what? umm looks like advancement-per-sample. So, degrees/sample . Maybe I'll make the signals be 60 degrees and 180 degrees and consider them together at 120 degrees. 0725
def signal_a(x):
...     return np.exp(2j * np.pi * x * 60/360)
...
def signal_b(x):
...     return np.exp(2j * np.pi * x * 180/360)
...
def signal_sum(x):
...     return signal_a(x) + signal_b(x)
...
0726 ok, first let's try the signals in a sample-aligned manner.
sample_idcs = np.arange(6)
aligned_products = [signal(sample_idcs) * np.exp(sample_idcs * 2j * np.pi * np.fft.fftfreq(6)[2]) for signal in (signal_a, signal_b, signal_sum)]
[(abs(product), np.angle(product) * 180 // np.pi) for product in aligned_products]
[(array([1., 1., 1., 1., 1., 1.]), array([   0.,  180.,   -1.,  179.,   -1.,  179.])),
 (array([1., 1., 1., 1., 1., 1.]), array([   0.,  -61., -121.,  179.,  119.,   59.])),
 (array([2., 1., 1., 2., 1., 1.]), array([   0., -121.,  -61.,  179.,   59.,  119.]))]
separating that output out:
signal_a at 120 degrees:
array([1., 1., 1., 1., 1., 1.])
array([   0.,  180.,   -1.,  179.,   -1.,  179.])
signal_b at 120 degrees:
array([1., 1., 1., 1., 1., 1.])
array([   0.,  -61., -121.,  179.,  119.,   59.])
signal_sum at 120 degrees:
array([2., 1., 1., 2., 1., 1.])
array([   0., -121.,  -61.,  179.,   59.,  119.])
0732 so first let's verify that signal_a and signal_b are understandable somehow. signal_a advances by 60 deg/sample.
signal_a at 120 degrees:
array([1., 1., 1., 1., 1., 1.])
array([   0.,  180.,   -1.,  179.,   -1.,  179.])
60 + 120 = 180, so it makes sense the second sample is at 180. The angle of -1 doesn't seem quite right, shouldn't it be 0.0?
aligned_products[0]
array([ 1.+0.00000000e+00j, -1.+3.33066907e-16j,  1.-7.21644966e-16j,
       -1.+3.67394040e-16j,  1.-1.44328993e-15j, -1.+1.44328993e-15j])
The angle being output as -1 comes from a complex number of 1 - 0j, so this must just be an artefact of the floor division in the angle conversion. Then it goes back to 180. So the output from signal_a is cycling at 180 degrees/sample because it is a 60 degree/sample signal harmonized with a 120 degree/sample component. Let's see signal_b's aligned output:
signal_b at 120 degrees:
array([1., 1., 1., 1., 1., 1.])
array([   0.,  -61., -121.,  179.,  119.,   59.])
signal_b moves at 180 degrees/sample, so harmonized with 120 degrees/sample, that makes 300 degrees/sample, which is -60+360. The parts of the output of signal_b are all increasing by this 300 degrees/sample amount:
-60+300-360
-120
-120+300
180
180+300-360
120
120+300-360
60
Now, we get to make things a little more complex. What does signal_sum look like, and why? I'm thinking signal_sum's output should be arithmetically equivalent to the sum of the other outputs, not sure. signal_sum is the sum of a 60 degree/sample signal and a 180 degree/sample signal, harmonized with a 120 degree/sample component.
signal_sum at 120 degrees:
array([2., 1., 1., 2., 1., 1.])
array([   0., -121.,  -61.,  179.,   59.,  119.])
When the two components resonate, the magnitude hits 2.0. Note that there are 2 total component signals here. That starting angle of -120: how does it compare to the sum of the previous advancing signals? signal_a was at 60 degrees with 120, so it advanced by 180. signal_b was 180 degrees with 120, so it advanced by 300. 300 + 180 = 480. what is 480?
480 - 360
120
That's _positive_ 120 degrees. What we have here is _negative_ 120 degrees.
-120 + 360
240
So it's not the sum of the angle advancements: it's actually twice that sum. 0740 [holy frack this is so hard to choose to do] 0741 Okay. The rates are summed when the data is _multiplied_. This data is summed. How does summing two complex numbers affect their angles? Alternatively, how do transforms on the underlying angles get affected by this summing? The first websearch result, on stackoverflow, says there is no clear relation between the angle of the complex sum and the angles of the summands other than the definition. The answer has only 1 upvote. I'm imagining in my mind two vectors aimed in different directions and summed, tip-to-tail. What I am able to do to these vectors is rotate them further by multiplying by a complex constant. When I do this, both vector components do indeed rotate the same amount. So the idea of multiplying by a constant works when working with _only one_ component, even though the result is a sum. We should be able to verify that by making one of the components destructively interfere, leaving only the other. 0745 Okay, one is interfering at 180 degrees/sample, the other is interfering at 300 degrees/sample, so if I unrotate it by 180 or by 300 degrees, it should leave only the other component in a resulting sum? i wonder if i could unrotate it a second time to make them both destructively interfere, is that meaningful or not meaningful? maybe the first one doesn't even work? I'll need some data spiraling at 180 or 300 degrees.
freq_180 = np.exp(sample_idcs * 2j * np.pi * 180/360)
freq_300 = np.exp(sample_idcs * 2j * np.pi * 300/360)
When I divide by this, I get data that is itself spiraling at 300 (which is -60) degrees:
[[np.abs(x), np.angle(x)*180//np.pi] for x in [aligned_products[-1] / freq_300]]
[[array([2., 1., 1., 2., 1., 1.]), array([  0., -61.,  59.,   0., -61.,  59.])]]
-60+360
300
I'm assuming the second instance of 300 degrees is a coincidence. If I sum this data, does the component that made 300 degrees cancel out? Or do I have something wrong? I do have something wrong. This data is aligned.
np.sum(aligned_products[-1]/freq_300)
(6.000000000000001-3.219646771412954e-15j)
np.sum(aligned_products[-1])
(2.1649348980190553e-15+5.551115123125783e-16j)
Rotating it all by a 300 degree/sample wave and summing gives 6 exactly. Meanwhile, the original sum was 0 exactly (e-15 and e-16 are 0 in floating point math), because the data was aligned already. I've actually pulled a wave out. I guess I need to understand this a little better to succeed at it.
okay I actually took some _off-list notes_ (omigod right) to try to hold these different things in my mind at once. the 300 degree/sample signal is already destructively interfering with itself, so unrotating it by 300 degrees/sample removes the interference, making it aligned and turning it into a peak. The result is actually correct. If we were to instead unrotate by 180 degrees/sample, we should see the other signal harmonize with itself and become a peak, summing to 6.0 because there are 6 samples. 0800
removed_180 = aligned_products[-1] / freq_180
np.abs(removed_180), np.angle(removed_180)*180//np.pi
(array([2., 1., 1., 2., 1., 1.]), array([  0.,  59., -61.,  -1.,  59., -61.]))
np.sum(removed_180)
(5.999999999999999-1.1102230246251565e-15j)
This does indeed work. Given the environment is one of complete destructive self-interference of spinning components, components can be isolated by further spinning at the rate that opposes them, and they stop destructively interfering. Somehow, the other components keep on destructively interfering. Why is it that the other components still destructively interfere after the rate change? And, how does this work if none of the components are destructively interfering? Thinking of the latter question, you'd indeed be able to produce a sum with one component removed, although others might be included. It's a little complex there. More challenges than I was hoping for. Comprehending it better could help me grapple with it.
- I'm spending a little time skipping by the detailed comprehension and daydreaming on the larger part of the problem. We can consider these spirals as individual points in a larger-dimensioned space composed of complex spirals. Addition in this space would be multiplication of the spirals, kind of roughly. The importance is the use of it. If we have a spiral at 300 degrees, we can remove this from data to remove the 300 degree component .. actually i confused myself typing the space up a little, and i'm bopping back to the more detailed understanding ..
[ basic idea is data contains many spirals. we can remove a spiral. this adjusts the other spirals. we can consider the parts of the problem in this space to simplify and summarise it. but it got somewhat corrupt. freq_300 and freq_180 would be components of this space: they add a 300 deg/sec and 180 deg/sec rotation to data. summing is kind of an evaluation of a point in the space. it shows whether the signals combine or not. the fourier frequencies are all components in this space. we're considering the one at 120 deg/sec . so are the original signals. they are at 60 deg/sec and 180 deg/sec . when we sum the original signals, ... this operation unfortunately was lost in the transformation to language. the 60 deg/sec and 180 deg/sec components are combined with the 120 deg/sec component. this makes combined information that has its own deg/sec property, but has peaks and things that show it contains two underlying waves. This combination is formed by summation, before or after the product. Since multiplication distributes over summation, operations on the unaccelerated waves can be considered identical to operations on the accelerated waves (accelerated by 120 deg/sec). Now, consider the final summation wave, which consists of both 60 deg/sec and 180 deg/sec components, both of them accelerated by 120 deg/sec to 180 deg/sec and 300 deg/sec. When we decelerate them by 300 deg/sec, this applies to both waves separately, although they are in a combined space. the 60 deg/sec wave is now at 180 - 300 = -120 deg/sec, which is 240 deg/sec according to the calculator. the 180 deg/sec wave is now at 300 - 300 = 0 deg/sec. So it harmonizes! Meanwhile, the 240 deg/sec wave is self-interfering because of how 240 deg/sec aligns with the sampling count of 6. ]
[ok that was quite helpful and simplifying in the end. it really slowed the idea down to type out all the space and then translate to normativeness of waves. i think spatially.]
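a bare-bones version of what just happened, which i think is really just the dft written out by hand (my own names):

import numpy as np

n = np.arange(6)
sig = np.exp(2j * np.pi * n * 60/360) + np.exp(2j * np.pi * n * 180/360)  # 60 + 180 deg/sample

def extract(data, rate_deg):
    # counter-rotate at rate_deg deg/sample, then sum: the matching component
    # stops self-interfering and piles up; the others still walk a circle and cancel
    return (data * np.exp(-2j * np.pi * n * rate_deg/360)).sum()

print(abs(extract(sig, 60)))     # ~6
print(abs(extract(sig, 180)))    # ~6
print(abs(extract(sig, 120)))    # ~0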
trying to return to work. {possible additional information is that cognitive and logical and [likely?] decision-tree concepts have similar properties like the distributive law and consideration as spaces.} ok so this component self-interferes when aligned because it's at 240 deg/s, which is -120 deg/s, for 6 samples. 6 samples of -120 deg/sample is [0, -120, -240, 0, -120, -240], which is [0, -120, 120, 0, -120, 120]. That's why it self-interferes: all its expanded angles oppose. I'm quickly thinking that given our data is accelerated or decelerated by a disaligned sampling rate, there may be some way to realign it all, but i haven't considered the sampling error in this way. We harmonized the 300 deg/sample data. The other data was 180 deg/sample, which self-interferes at 2 samples. After the change, it is now accelerating at -120 deg/sample, which self-interferes at 3 samples. Ok. Let's look at data that has been sampled at a rate that is off by 10%, now. Here's how the aligned accelerated signals were made:
sample_idcs = np.arange(6)
aligned_products = [signal(sample_idcs) * np.exp(sample_idcs * 2j * np.pi * np.fft.fftfreq(6)[2]) for signal in (signal_a, signal_b, signal_sum)]
[(abs(product), np.angle(product) * 180 // np.pi) for product in aligned_products]
[(array([1., 1., 1., 1., 1., 1.]), array([   0.,  180.,   -1.,  179.,   -1.,  179.])),
 (array([1., 1., 1., 1., 1., 1.]), array([   0.,  -61., -121.,  179.,  119.,   59.])),
 (array([2., 1., 1., 2., 1., 1.]), array([   0., -121.,  -61.,  179.,   59.,  119.]))]
With sample rate off by 10%:
scaled_products = [signal(sample_idcs*1.1) * np.exp(sample_idcs * 1.1 * 2j * np.pi * np.fft.fftfreq(6)[2]) for signal in (signal_a, signal_b, signal_sum)]
[(abs(product), np.angle(product) * 180 // np.pi) for product in scaled_products]
[(array([1., 1., 1., 1., 1., 1.]), array([   0., -162.,   36., -127.,   72.,  -91.])),
 (array([1., 1., 1., 1., 1., 1.]), array([   0.,  -30.,  -60.,  -91., -120., -151.])),
 (array([2.        , 0.81347329, 1.33826121, 1.90211303, 0.20905693, 1.73205081]),
  array([   0.,  -96.,  -12., -109.,  156., -121.]))]
Organizing the output:
signal_a offsets scaled by 10%
array([1., 1., 1., 1., 1., 1.])
array([   0., -162.,   36., -127.,   72.,  -91.])
signal_b offsets scaled by 10%
array([1., 1., 1., 1., 1., 1.])
array([   0.,  -30.,  -60.,  -91., -120., -151.])
signal_sum offsets scaled by 10%
array([2.        , 0.81347329, 1.33826121, 1.90211303, 0.20905693, 1.73205081])
array([   0.,  -96.,  -12., -109.,  156., -121.])
We can see the -162=198 deg/s rate of signal_a, and the -30=330 deg/s rate of signal_b. When I think of scaling the sampling points of a signal, this seems roughly equivalent to scaling its rate by the same amount. It seems helpful to think of that individually.
np.angle(signal_a(sample_idcs)) * 180 // np.pi
array([   0.,   59.,  119.,  180., -121.,  -60.])
np.angle(signal_a(sample_idcs*1.1)) * 180 // np.pi
array([   0.,   66.,  132., -162.,  -96.,  -31.])
np.angle(signal_a(sample_idcs)) * 180 * 1.1 // np.pi
array([   0.,   66.,  132.,  198., -133.,  -67.])
Yes, scaling the indices is the exact same operation as scaling the angles. So, I mean, simplifying all this a lot, it seems it would make sense to unscale the angles right off. I might be missing something! It's still inhibited to think about. But couldn't we take the output, and just decelerate it by 10%? To do that, I'd need a decelerating wave for 10%. Is that what waves do? The previous "acceleration" we were using added a constant to every increment. Multiplying the angles by 1.1 does something different: it scales each angle by the same factor. The amount it is accelerated by is different depending on what its rate already is. So, this may not preserve across addition and multiplication, since I haven't thought much about what it means yet, consciously. I tried multiplying the angles by 1.1 and 1/1.1 a few times but didn't get anywhere I understood. I'll go back to looking at it closely. 0831
signal_a offsets scaled by 10%
array([1., 1., 1., 1., 1., 1.])
array([   0., -162.,   36., -127.,   72.,  -91.])
This is advancing at -162 = 198 degrees. It's composed of a 60 deg/s wave sampled at 110% the rate, multiplied by a 120 deg/s wave sampled at 110% rate. At 100% sampling rate, these multiplied to make a 180 deg/s wave that self-interfered. Now, its 198 deg/s rate can be calculated as 60*1.1 + 120*1.1 = 198. Notably, 198 = 180 * 1.1. So the sample rate scaling operation is preserved _after_ the multiplication; it distributes out.
signal_b sample rate scaled to 110%
array([1., 1., 1., 1., 1., 1.])
array([   0.,  -30.,  -60.,  -91., -120., -151.])
signal_b was 180 deg/s. So, (180 + 120) = 300, and 300 * 1.1 = 330, which is -30. It's advancing at 330 deg/s, because 300 deg/s * 1.1 = 330. It appears as -30 modulo 360. {It's a little nice at this time to connect with the larger goal of extracting high frequency signals from low frequency data. This operation of acceleration during harmonization with a fourier wave differentiates between a -30 deg/s wave and a 330 deg/s wave, even though they are the same in a single snapshot. This similarity could help the whole idea unify some, but it's so hard to think.}
signal_sum sample rate scaled to 110%
array([2.        , 0.81347329, 1.33826121, 1.90211303, 0.20905693, 1.73205081])
array([   0.,  -96.,  -12., -109.,  156., -121.])
We should now be able to consider this resulting wave as a composition of the two underlying waves. There's a slow one at 198 deg/s, that is originally at 60 deg/s, was undersampled to 66 deg/s, and then harmonized with a 120 deg/s wave (undersampled to 132 deg/s) up to 198 deg/s. Then there's a fast one at 330 deg/s, that is originally at 180 deg/s, was undersampled to 198 deg/s, and then harmonized with the 120 or 132 deg/sec wave up to 330 deg/s. The signals are unaligned and are making noise. Further accelerations or decelerations could be applied to cause them to harmonize / constructively interfere, or desynchronize / destructively interfere, making them sum to 6 or go away entirely. A sum would contain the other waves, as well. There's a space here where one could reconsider the fourier transform to be appropriate overall for the situation, by considering how it uses these various de/accelerations to completely synchronise or desynchronise components at the same time. For now, there are 6 samples, so things that self-interfere in up to 6 units work. The previous angles self-interfered in up to 3: that means 0 degrees (1 sample), 180 degrees (2 samples), or 120 degrees (3 samples).
The original data is at 60 deg/s and 180 deg/s, and the original harmonizing wave was at 120 deg/s. We're trying to use that harmonizing wave to demonstrate that the original data was not at 120 deg/s at all, and was entirely at 0 deg/s, 120 deg/s, and/or 180 deg/s. So here's the relevant point of current confusion: given the angles are all scaled to 110%, can I realign them so that they rest at multiples of 120 and 180 deg/s? 0847
signal_a
np.arange(6) * (60*1.1+120*1.1)
array([  0., 198., 396., 594., 792., 990.])
signal_b
np.arange(6) * (180*1.1+120*1.1)
array([   0.,  330.,  660.,  990., 1320., 1650.])
They exist as a vector sum. Each sample in the sum can have its angle added to or subtracted from, and the changes to the angle will propagate back to the original wave's contribution. Multiplying the angle does not have the same effect, due to the modulo 360 behavior. 0852 It seems reasonable to consider the first sample, at 198 deg and 330 deg . 198 is 18 more than 180, the original angle, whereas 330 is 30 more than 300, the original angle. I'm looking at this a smidge more and seeing that it's not at 120 degrees that signal_b usually cancels out: it's at -60 degrees, which is 300:
freq_120 = np.exp(sample_idcs * 2j * np.pi * 120/360)
np.angle(signal_b(np.arange(6))*freq_120)*180//np.pi
array([   0.,  -61., -121.,  179.,  119.,   59.])
It takes all 6 samples for it to cancel at 300 degrees, as that 60 degree offset accumulates over the circle. The offsets for the samples taken at the high rate are 18 (198 - 180) and 30 (330 - 300). Since their magnitudes are different, multiplication would be needed to return them to their individual values. So far, the only operation on angles I've found a way to preserve across addition of the signals is addition of further angles. Given these 18 and 30 degree offsets are combined together, it doesn't seem like I'd be aligning the underlying waves in the first sample using simple addition of angles.
- there may be a way to scale the angles
- the fourier frequencies could be sampled at different points than the signals, to scale things that way, maybe
- the data could be aligned in a different way such that cancellation still works
- other options
it got harder to think when i realised the fourier frequency scaling may work here. i'm not quite sure it does, especially after briefly trying it while typing this paragraph. the scaling comes out via distribution from the signal and fourier sampling. maybe it can work! the sample angles are scaled by 1.1. the fourier angles are scaled by 1.1. the fourier angles are then added to the sample angles. if i wanted to change the fourier angles so as to return the sample angles to zero, i would need to shrink them differently for each one, kind of like a + b = c * d, solving for a b that undoes a and d and depends on c. that doesn't sound like the approach to me, but i'm having some difficulty considering it. it seems analogous to the situation after the sample scaling.
let's see about angles and frequencies. let's call the angular rate the frequency, so 180 deg/s is a frequency. this appears to be the same frequency passed to the exp function. we're multiplying the frequency of the signal by 110%. then we're multiplying the frequency of the fourier wave by 110%. then we're multiplying the two waves, which adds their frequencies.
output_freq = sample_freq * 1.1 + fourier_freq * 1.1
the result is an output_freq that is * 1.1. if i want to undo the * 1.1 by reducing the fourier_freq, it won't backpropagate to two different sample frequencies. it will instead depend on them.
sample_freq + fourier_freq = sample_freq * 1.1 + X
X = sample_freq + fourier_freq - sample_freq * 1.1
X = fourier_freq - sample_freq * 0.1
So, an individual component can be processed correctly by subtracting the fraction of its frequency proportional to the change in sampling rate (small numeric check below). This works for only one component, so the other components would need some form of destructive interference. What happens if there are two sample frequencies? it's helpful to unify the wave operations. let's take this funny adjuster wave and put it into the space concepts from before.
X = fourier_freq - sample_freq * 0.1
this X is a new fourier_freq, a fourier wave adjusted for sample_freq * 1.1. it's hard to hold all these together. so we could have
output_wave = (sample_1_wave & sample_2_wave) + fourier_wave
where fourier_wave could be 120 Hz, or 120*1.1 = 132 Hz, or it could be de-adjusted for either sample_1 or sample_2. All those options exist. We're considering adding the option of de-adjusting these waves, to the output, to look for recombinations where everything destructively interferes. The impact of these de-adjustments applies to both waves. [0910?] It includes a subtraction of 10% of the original sample frequency, so that each wave could individually return to its original frequency. [0912].
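the small numeric check of X = fourier_freq - sample_freq * 0.1 mentioned above (plain arithmetic; it only repairs the one component it was built for):

sample_a, sample_b = 60, 180        # deg/sample of the two signals
fourier_freq, scale = 120, 1.1      # harmonizing wave and the 10% stretch

X = fourier_freq - sample_a * (scale - 1)    # de-adjusted fourier freq for sample_a

print(sample_a * scale + X)   # 180.0 -- back to sample_a + fourier_freq, as wanted
print(sample_b * scale + X)   # 312.0 -- not 300; the other component stays off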
The waves are at 330 and 198 Hz, so the de-adjustments will be 30 and 18 Hz. The output wave pairs would then be 300 Hz and 198-30=168 Hz, or 330-18=312 Hz and 180 Hz. Because the mutation provides for aligning the waves at 300 Hz and 180 Hz, it's important to consider. Being aware of the paired values of 168 Hz and 312 Hz also seems important. There may be some way in frequency space to resonate these small offsets associated with 10% of the various original frequencies, so that they themselves self-interfere destructively. It looks, however, like it could be more reasonable to consider a mutation of the fourier transform itself to handle this situation. At this point, it seems this would reduce the number of parts to consider at once. - 0914

[0915] It seems the fourier transform basically scales everything such that the output waves hit fractions of their rotations every sample. So if you have 6 samples, you want the output waves to engage 360/6 = 60 degree points of their phases, so as to self-interfere destructively. Where can that come from if the sample rate is scaled?
np.angle(signal_a(np.arange(6)*1.1))*180//np.pi
array([   0.,   66.,  132., -162.,  -96.,  -31.])
Here we can see a 60 degree signal being snapshot at 110% of its points. We don't actually have data from it at the 60 degree points. We have offset data. The new realization is that this data can be made to self-interfere, by using a harmonizing wave that is appropriately shifted.
un_60_10 = np.exp(np.arange(6) * (1 - 60/360 * 0.1) * 2j * np.pi)
np.angle(signal_a(np.arange(6)*1.1) * un_60_10)*180//np.pi
array([   0.,   59.,  119.,  179., -121.,  -61.])
This wave's speed is altered specifically for this precise wave of interest. The way it strikes it slightly differently at each sample rotates all the samples back, producing output as if the wave had been sampled at the right rate. The problem is that the original wave was alongside many other waves. So this can be used to resonate the wave, or to make it self-interfere, but the other waves get garbled.

[i'm having some trouble holding the idea of how aligning the other waves can be similar to the rest of the fourier transform. i might be holding some contradictory assumptions in maybe a small part of my considering. for my consciousness it's like amnesia and blocking and stuff.]

what does the fourier transform do to a single wave? overall, the fourier transform lets you extract waves by resonating them in ways that destructively interfere the others. it does this all at the same time, but can be broken apart to be considered in parts. so we would roughly want to resonate the wave while destructively interfering the others.

what's notable is that we know the frequencies of all the waves. given we know the frequencies of all the waves, and here we even know their phases, there should be a way to extract them precisely from a sufficient quantity of data containing them summed in known proportions. it is of course possible that that isn't reasonable to do.

so i'm thinking of: how can i extract this, when it is summed with something else? i found that when i hit it with this special wave, the other wave, the one i'm not focusing on, gets accelerated a different way. here, since i happen to know the phase of the data, i could actually craft a destructive wave, send it in, and the two would sum to 0. there is probably a way in a larger metaspace of waves to craft a metawave that would do that regardless of phase. there may even be a normative way to do that, not sure.

the kind of obvious way of observing this is that, given the 10% increase is a rational fraction, many frequencies can be added so that the original fourier destructiveness happens, with 10 overlaps through all the data. i think there is a better solution, though. but it may engage a generalisation of spaces like that x10 space. we could for example imagine infinite frequencies coming in smoothly.
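Here's a tiny sketch of that "craft a destructive wave" idea for the case where the other component's frequency and phase are fully known (the names are just illustrative):

import numpy as np

t = np.arange(6) * 1.1                       # the wrong-rate sample instants
wave_a = np.exp(2j * np.pi * (60/360) * t)   # the component we want
wave_b = np.exp(2j * np.pi * (180/360) * t)  # the component we know completely (freq and phase)
mixture = wave_a + wave_b

destructive = -np.exp(2j * np.pi * (180/360) * t)  # crafted anti-phase copy of wave_b
recovered = mixture + destructive                  # the known component cancels exactly
assert np.allclose(recovered, wave_a)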
np.angle(signal_a(np.arange(6)*1.1))*180//np.pi
array([   0.,   66.,  132., -162.,  -96.,  -31.])
un_60_10 = np.exp(np.arange(6) * (1 - 60/360 * 0.1) * 2j * np.pi)
np.angle(signal_a(np.arange(6)*1.1) * un_60_10)*180//np.pi
array([   0.,   59.,  119.,  179., -121.,  -61.])
np.angle(signal_b(np.arange(6)*1.1))*180//np.pi
array([   0., -162.,   36., -127.,   72.,  -91.])
un_180_10 = np.exp(np.arange(6) * (1 - 180/360 * 0.1) * 2j * np.pi)
np.angle(signal_b(np.arange(6)*1.1) * un_180_10)*180//np.pi
array([   0., -181.,    0.,  179.,    0.,  179.])
There's signal a and signal b being separately aligned again. They're being aligned to kind of specific values. These values could be anything. The fourier transform makes a matrix of coefficients for different offsets and frequencies. Its goal is simply to do that resonance and deresonance. It needs enough points to do that.

We can collect a lot of interesting points here, and combine them such that they resonate. In fact, we could collect every frequency, and combine it such that it resonated. In the background of each would be some "noise" -- the other frequencies combined according to the adjustment offsets. It would be much, much smaller than the incoming data. It also has precise arithmetic properties that can be calculated.

So there is an initial solution here that involves a feedback loop: each frequency could be resonated precisely, and this initial result could be assumed. Then, assuming that result, a destructive wave for the whole signal could be made, and sent back in. The remaining signal would show the error in the construction of the wave, and this error could be used to inform the solution with greater precision. This would be repeated ad infinitum until the precise solution is reached. Then, flattening that recursive approach would give the exact solution. I don't have the training to know how to do that readily, but it shows there is a form for a precise solution.
Further avenues on that approach could include an analysis of the noise to account for it immediately, or an analysis of the recursive solution, which would involve solving for the measurement correction given the error after combining with an imperfect destructive wave.

Another approach is to consider the system as linear algebra. We write an equation for the signal given how we want to represent it, and simply solve for the coefficients of the matrix involved. 0933

So, a fourier transform models a signal as being made of as many sinusoids as it has samples. Here, we have 6 sinusoids -- mostly because the data is complex; only half as many are needed for real-valued data. Each of these 6 sinusoids has a real and an imaginary component. It's written like exp(2j * pi * freq * sample_point). There are 6 frequencies, and 6 sample points, so this makes a 6x6 matrix. This should somehow make a system of equations that has 6 unknowns. I'm thinking the 6x6 matrix has the evaluation of the sine and cosine at each row (or column), spread out for each different sample.

When you apply the fourier transform, you multiply a matrix like this by the signal. But here, rather, the matrix and the data are the knowns, and the frequency coefficients are the variables -- more like an inverse fourier transform set equal to the data:

data = F @ x

We can then solve that linear system of equations for x, and x becomes the frequency representation. This linear equation should work for any selection of frequencies, and any selection of wavelets. So much simpler! Man, I was asking my teachers to teach me this in high school, but it never happened. I never figured out what classes to take in college to learn this either. I dunno. But I bet this is how wavelets are normally cast in engineering. I _think_ this should work. And it explains the validity of taking the matrix inverse later. Next step is to try it.
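A minimal sketch of that data = F @ x idea, with an arbitrary choice of frequencies and sample instants (the names here are just illustrative, not the fourier.py below):

import numpy as np

n = 6
freqs = np.fft.fftfreq(n)                         # any choice of n frequencies
times = np.arange(n) * 1.1                        # any choice of n sample instants
F = np.exp(2j * np.pi * np.outer(times, freqs))   # rows: samples, columns: frequencies

x_true = np.random.random(n) + 1j * np.random.random(n)  # a pretend spectrum
data = F @ x_true                                 # the observed samples
x = np.linalg.solve(F, data)                      # recover the frequency coefficients
assert np.allclose(x, x_true)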
[it's a big accomplishment for me here to find a solution. my reason for pursuing this is to work on my inhibition around completing novel algorithms. i hope to take this further, and also to write many other novel algorithms. i'm stepping away for now to do other things.]
ok, so let's try the matrix inverse approach to making a fourier transform, first with a normal transform. there are n frequencies, and each of these is evaluated at n sample offsets.
freqs = np.fft.fftfreq(6)
freqs
array([ 0.        ,  0.16666667,  0.33333333, -0.5       , -0.33333333, -0.16666667])
mat = np.exp(2j * np.pi * np.outer(np.arange(6), freqs))
np.abs(mat), np.angle(mat)*180//np.pi
(array([[1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.]]),
 array([[   0.,    0.,    0.,    0.,    0.,    0.],
        [   0.,   59.,  119., -181., -120.,  -60.],
        [   0.,  119., -121.,    0.,  120., -120.],
        [   0.,  180.,   -1., -180.,    0., -181.],
        [   0., -121.,  119.,    0., -120.,  120.],
        [   0.,  -61., -121., -180.,  120.,   60.]]))
To transform something from frequency space to the time domain, we would have a vector of magnitudes, each one applying to each frequency. Each sample is generated from all the frequencies, multiplied by all the magnitudes, and summed.

np.outer turns its arguments into dimensions in the order they are passed. So the first dimension of the matrix (the larger dimension), here, is going to be samples, whereas the second dimension of the matrix (the smaller dimension), here, is going to be frequencies, simply because I happened to pass them in that order. I guess I like post-multiplying my matrices, since in my culture information flows from left to right, so maybe I'll swap the order if that's not what ends up happening.

The larger dimension is samples. The smaller is frequencies. If I have a vector of data, and matrices are multiplied rows x columns, I guess my vector would be a row, and the whole thing would be multiplied by every column, and summed. So if the matrix is on the right, the columns will be summed across. Uhhh, that would be the larger dimension: the samples. So I think I want to transpose the matrix.
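A quick check of that claim about which axis gets summed (toy numbers, just for illustration):

import numpy as np

v = np.array([1.0, 2.0, 3.0])         # pretend spectrum, one value per frequency
M = np.arange(12.0).reshape(3, 4)     # first axis as "frequencies", second axis as "samples"
manual = sum(v[i] * M[i] for i in range(3))
assert np.allclose(v @ M, manual)     # v @ M sums over M's first axis, so that axis must be frequencies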
mat = np.exp(2j * np.pi * np.outer(freqs, np.arange(6)))
Now passing the frequencies first, I think they'll be summed across. Not sure. So maybe I can make a single wave of data if I pass a 1 for the 0.5 frequency, and 0s everywhere else ...
freqs array([ 0. , 0.16666667, 0.33333333, -0.5 , -0.33333333, -0.16666667])
(np.array([0,0,0,1,0,0]) @ mat).real
array([ 1., -1.,  1., -1.,  1., -1.])
ok, that didn't go as I expected. it gave me nyquist data. oops. okay, that makes sense now that I think about frequencies. 0.5 is 180 degrees/sample. I want the 0.16666 entry to have a full wave.
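A quick check of the 180-degrees-per-sample point: the 0.5 bin is just an alternating sign.

import numpy as np
n = np.arange(6)
assert np.allclose(np.exp(2j * np.pi * 0.5 * n), (-1.0) ** n)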
(np.array([0,1,0,0,0,0]) @ mat).real
array([ 1. ,  0.5, -0.5, -1. , -0.5,  0.5])
There we go. And inverting the matrix should be its transpose ...?
np.angle(np.linalg.inv(mat)) * 180 // np.pi
array([[   0.,    0.,    0.,   -1.,   -1.,   -1.],
       [   0.,  -61., -120.,  179.,  120.,   59.],
       [   0., -121.,  120.,   -1., -121.,  119.],
       [   0., -180.,    0.,  180.,   -1.,  180.],
       [  -1.,  120., -120.,   -1.,  119., -121.],
       [  -1.,   60.,  120.,  180., -121.,  -61.]])
np.abs(np.linalg.inv(mat))
array([[0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667],
       [0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667],
       [0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667],
       [0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667],
       [0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667],
       [0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667]])
It's transposed (really the conjugate transpose, going by the negated angles), and also scaled by 1/6, which doesn't mean too much. Okay! Let's see if it can process data that is sampled 10% too slowly ;p 1635
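Before moving on, here's a quick check of that observation: for this unit-rate square matrix the inverse should be exactly the conjugate transpose divided by 6.

import numpy as np
freqs = np.fft.fftfreq(6)
mat = np.exp(2j * np.pi * np.outer(freqs, np.arange(6)))
assert np.allclose(np.linalg.inv(mat), mat.conj().T / 6)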
mat = np.exp(2j * np.pi * np.outer(freqs, np.arange(6)*1.1))
My first question is whether the matrix inverts ...!
mat2freq = np.linalg.inv(mat)
It worked fine !! And its magnitude is 1/6?
abs(mat2freq)
array([[0.12780629, 0.15711185, 0.33061418, 0.33061418, 0.15711185, 0.12780629],
       [0.19271258, 0.16807326, 0.15711185, 0.15711185, 0.16807326, 0.19271258],
       [0.18145366, 0.19271258, 0.12780629, 0.12780629, 0.19271258, 0.18145366],
       [0.18145366, 0.19271258, 0.12780629, 0.12780629, 0.19271258, 0.18145366],
       [0.19271258, 0.16807326, 0.15711185, 0.15711185, 0.16807326, 0.19271258],
       [0.12780629, 0.15711185, 0.33061418, 0.33061418, 0.15711185, 0.12780629]])
No, it is full of wonky, different-length vectors to do the funny rate scaling. Ok. Better try using it. 1637 [afk, returned 1643]

So, here's slowed data from the mat forward pass:
(np.array([0,1,0,0,0,0]) @ mat).real
array([ 1.        ,  0.40673664, -0.66913061, -0.95105652, -0.10452846,  0.8660254 ])
And of course we can invert it (I hope):

array([-1.77253931e-09,  5.00000001e-01, -7.02464406e-11,  1.59520641e-09,
       -7.10239902e-10,  5.00000000e-01])

Inverting turned it into complex frequency data because I had taken only the real component: it has 0.5 as a positive wave and 0.5 as a negative wave. So now that we have that complex frequency data, can we unscale the sampling using a normal ifft?
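(Aside on that 0.5/0.5 split: keeping only the real part always spreads the weight evenly over the positive and negative frequency. A quick numpy check, separate from the matrices above:)

import numpy as np
n = np.arange(6)
spectrum = np.fft.fft(np.cos(2 * np.pi * n / 6)) / 6   # divide by 6 to match the magnitude convention here
assert np.allclose(spectrum, [0, 0.5, 0, 0, 0, 0.5])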
np.fft.ifft(np.array([-1.77253931e-09, 5.00000001e-01, -7.02464406e-11,
                      1.59520641e-09, -7.10239902e-10, 5.00000000e-01]))
array([ 0.16666667+0.00000000e+00j,  0.08333333+2.36712649e-10j,
       -0.08333333+5.19624660e-11j, -0.16666667+1.48285615e-18j,
       -0.08333333-5.19624660e-11j,  0.08333333-2.36712651e-10j])
No ... it needs the complex components I trimmed. Okay, lemme try again.
test_freqs = np.array([0,1,0,0,0,0])
test_data = (test_freqs @ mat).real
test_data
array([ 1.        ,  0.40673664, -0.66913061, -0.95105652, -0.10452846,  0.8660254 ])
analysed_freqs = test_data @ mat2freq
rescaled_data = np.fft.ifft(analysed_freqs)
rescaled_data
array([ 0.16666667-1.44560290e-17j,  0.08333333+4.04768811e-18j,
       -0.08333333+3.14030088e-18j, -0.16666667+9.91313139e-18j,
       -0.08333333+5.12143680e-17j,  0.08333333+2.59378204e-17j])
rescaled_data.real
array([ 0.16666667,  0.08333333, -0.08333333, -0.16666667, -0.08333333,  0.08333333])
It did successfully unscale the data, although its final magnitude was 1/6th the original magnitude. I'm excited that this can be used to make a general fourier function. Missing parts include:
- rectangular transformation, input and output samplecount differing
- resuming of a transform, propagating the phase forward

With resuming, I'm thinking that can probably be done by simply multiplying a constant vector to the matrix, to shift it. Not quite sure how, but visible from previous stuff. [1653]

1702 Here's my draft fourier.py (License: AGPL-3):

import numpy as np

def create_freq2time(freq_rate, time_rate, freq_count, time_count):
    freqs = np.fft.fftfreq(freq_count)
    offsets = np.arange(freq_count) * freq_rate / time_rate
    mat = np.exp(2j * np.pi * np.outer(freqs, offsets))
    return mat / freq_count # scaled to match numpy convention

def create_time2freq(time_rate, freq_rate, time_count, freq_count):
    if time_count != freq_count:
        raise NotImplementedError("differing input and output samplecounts")
    forward_mat = create_freq2time(freq_rate, time_rate, freq_count, time_count)
    reverse_mat = np.linalg.inv(forward_mat)
    return reverse_mat * freq_count * freq_count # scaled to match numpy convention

I usually make mistakes. 1703

So, in theory I can use this to extract that random data that was resampled and repeated. I'll likely run into issues, so to handle my funniness, I'll maybe do that in stages, and write tests that run through the motions slowly.

Here it is passing a test that it can produce the same results as np.fft. I had [1709] to remove the scaling from the reverse_mat, which was already descaled by the inverse operation.

import numpy as np

def create_freq2time(freq_rate, time_rate, freq_count, time_count):
    freqs = np.fft.fftfreq(freq_count)
    offsets = np.arange(freq_count) * freq_rate / time_rate
    mat = np.exp(2j * np.pi * np.outer(freqs, offsets))
    return mat / freq_count # scaled to match numpy convention

def create_time2freq(time_rate, freq_rate, time_count, freq_count):
    if time_count != freq_count:
        raise NotImplementedError("differing input and output samplecounts")
    forward_mat = create_freq2time(freq_rate, time_rate, freq_count, time_count)
    reverse_mat = np.linalg.inv(forward_mat)
    return reverse_mat

def test():
    randvec = np.random.random(16)
    ift16 = create_freq2time(1, 1, 16, 16)
    ft16 = create_time2freq(1, 1, 16, 16)
    randvec2time = randvec @ ift16
    randvec2freq = randvec @ ft16
    randvec2ifft = np.fft.ifft(randvec)
    randvec2fft = np.fft.fft(randvec)
    assert np.allclose(randvec2ifft, randvec2time)
    assert np.allclose(randvec2fft, randvec2freq)

if __name__ == '__main__':
    test()

It's pleasant to have this basic test pass. It's a small, solid-seeming unit. 1709

1718 my test regarding scaling is failing. maybe step through it and take notes, to sort it out. having some larger inhibition, so sending this. 1720
ok, to debug this, i gotta understand it again !!!!

i'll seed numpy's random number generator so the numbers are deterministic.

 16  B      def test():
 17  ->         np.random.seed(0)
 18             randvec = np.random.random(16)
(Pdb) n
/shared/src/scratch/fourier.py(18)test()
-> randvec = np.random.random(16)
(Pdb)
/shared/src/scratch/fourier.py(19)test()
-> ift16 = create_freq2time(1, 1, 16, 16)
(Pdb) p randvec
array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ,
       0.64589411, 0.43758721, 0.891773  , 0.96366276, 0.38344152,
       0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606,
       0.0871293 ])
The failing test is treating randvec as a frequency vector, so I think that's a 0.5488 magnitude DC offset.

 27
 28             # sample data at a differing rate
 29  B          time_rate = np.random.random() * 2
 30             freq_rate = 1.0
 31             freqs = np.fft.fftfreq(len(randvec))
 32  ->         rescaling_ift = create_freq2time(freq_rate, time_rate, len(randvec), len(randvec))
 33             rescaling_ft = create_time2freq(time_rate, freq_rate, len(randvec), len(randvec))
(Pdb) p time_rate
0.04043679488065144

Oh that's very slow. It's interesting that it should still be able to extract the wave, since it's all simulated math. I wonder what that means for physical systems: are signals _really_ made of _precise_ sinusoids? If they are, you should be able to extract them at a samplerate that is 2000% wrong, with only a small number of samples! I guess you'd need incredibly good measuring equipment.

So I'm going to step into freq2time and look at how the matrix is generated. Following lines in the code generate the data manually, and that is a big candidate for mistakes, since it's otherwise untested.

(Pdb) list
  1     import numpy as np
  2
  3     def create_freq2time(freq_rate, time_rate, freq_count, time_count):
  4         freqs = np.fft.fftfreq(freq_count)
  5         offsets = np.arange(freq_count) * freq_rate / time_rate
  6  ->     mat = np.exp(2j * np.pi * np.outer(freqs, offsets))
  7         return mat / freq_count # scaled to match numpy convention
  8
  9     def create_time2freq(time_rate, freq_rate, time_count, freq_count):
 10         if time_count != freq_count:
 11             raise NotImplementedError("differing input and output samplecounts")
(Pdb) p freqs
array([ 0.    ,  0.0625,  0.125 ,  0.1875,  0.25  ,  0.3125,  0.375 ,
        0.4375, -0.5   , -0.4375, -0.375 , -0.3125, -0.25  , -0.1875,
       -0.125 , -0.0625])
(Pdb) p offsets
array([  0.        ,  24.72995209,  49.45990418,  74.18985626,
        98.91980835, 123.64976044, 148.37971253, 173.10966462,
       197.83961671, 222.56956879, 247.29952088, 272.02947297,
       296.75942506, 321.48937715, 346.21932924, 370.94928132])

Since the time samplerate is very low, it makes sense that the offsets used to evaluate the frequencies are very large. That much looks right. Time to think about the matrix. The frequencies are the first dimension (the larger one), and the offsets are the second dimension (the smaller one). So the frequencies will be the .... rows. And the offsets will be the ... columns? I think? Yes, moving inside the row is the smaller dimension, which is the columns. So visually, looking at the mat, it will be laid out like the actual signal, with time going from left to right, and the vertical parts will be the frequencies, with DC at the top.

(Pdb) n
/shared/src/scratch/fourier.py(7)create_freq2time()
-> return mat / freq_count # scaled to match numpy convention
(Pdb) p mat / freq_count
array([[ 0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ,
         0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ,
         0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ,
         0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ,  0.0625    +0.j        ],
       [ 0.0625    +0.j        , -0.05994975-0.01767137j,  0.05250712+0.03390062j, -0.04077949-0.04736331j,
         0.02572393+0.05696077j, -0.0085691 -0.06190978j, -0.00928505+0.06180645j,  0.02638146-0.05665923j,
        -0.04132493+0.04688816j,  0.05289596-0.03329065j, -0.06015026+0.01697635j,  0.06249581+0.00072336j,
        -0.05974121-0.01836403j,  0.05211125+0.03450605j, -0.04022859-0.04783211j,  0.02506296+0.05725468j],
There's the first row of DC components, and the second row handling the first frequency, which is 0.0625, or 1/16th of a cycle/sample.

I'm immediately noticing that I divided by freq_count. I don't believe I did that in my comparison code. What does it mean, here? This is a matrix for converting from frequency space, to time space. When rendering a wave, this matrix is multiplied. That means summing all the frequency components. So, if I am rendering a wave manually, I would multiply all the sinusoids by randvec, and then take the _average_ of the result, not the sum. Because of the numpy convention of storing spectrums scaled by the sample count.

Since my random number generator is seeded to 0 now, I can halt the debugging, add the change, and resume.

It passes now. The functions didn't need any changes, I don't believe. Here's the new test:

# sample data at a differing rate
time_rate = np.random.random() * 2
freq_rate = 1.0
freqs = np.fft.fftfreq(len(randvec))
rescaling_ift = create_freq2time(freq_rate, time_rate, len(randvec), len(randvec))
rescaling_ft = create_time2freq(time_rate, freq_rate, len(randvec), len(randvec))
rescaled_time_data = np.array([
    np.mean([
        randvec[freqidx] * np.exp(2j * np.pi * freqs[freqidx] * sampleidx / time_rate)
        for freqidx in range(len(randvec))
    ])
    for sampleidx in range(len(randvec))
])
assert np.allclose(rescaled_time_data, randvec @ rescaling_ift)
unscaled_freq_data = rescaled_time_data @ rescaling_ft
unscaled_time_data = unscaled_freq_data @ ift16
assert np.allclose(unscaled_freq_data, randvec)
assert np.allclose(unscaled_time_data, randvec2time)

So, next would be trying to extract my funny-sampled random data. I am hoping it can be exactly recovered!
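A quick check of the average-not-sum point above, independent of the test code: numpy's ifft is exactly the mean of the frequency components at each sample.

import numpy as np

N = 16
spectrum = np.random.random(N) + 1j * np.random.random(N)
freqs = np.fft.fftfreq(N)
# each time sample is the *mean* over the frequency components, matching np.fft.ifft's 1/N scaling
manual = np.array([np.mean(spectrum * np.exp(2j * np.pi * freqs * n)) for n in range(N)])
assert np.allclose(manual, np.fft.ifft(spectrum))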
it's so different to be in a state of mind of having accomplished something, rather than working on it.

it's pretty clear that it should be able to recover the random data. if it doesn't, it would be due to a mistake, not due to a faulty idea. it's not clear that it can be used to do useful physical-world recording (e.g. recording something out-of-band of the recording device); the math assumes point sampling. but it seems pretty clear that it could be fudged into working in some way if it doesn't, possibly by modeling the sampling process better and/or amplifying the source of the signal.
but i need to get this far a lot! and get farther! my longer-term project was the flat_tree thing for random-write append-only media. now that i've successfully made some forensic math, maybe i can preserve all the trash or whatever using that.

here's fourier.py for finding later. the next step would be adding streaming, so that it preserves phase changes when re-applied to a following chunk, accumulating data accurately. that just means multiplying by an array that has components with appropriate complex phases.
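A sketch of that last sentence (illustrative names, not the eventual fourier.py interface): shifting a chunk's start by M samples is the same as rotating each frequency component by exp(2j*pi*f*M).

import numpy as np

N = 8          # number of frequency bins
M = 5          # chunk length in samples (shorter than one period, so the phases matter)
freqs = np.fft.fftfreq(N)
spectrum = np.random.random(N) + 1j * np.random.random(N)

def freq2time(offsets):
    # frequency rows, time columns, scaled like np.fft.ifft
    return np.exp(2j * np.pi * np.outer(freqs, offsets)) / N

chunk0 = spectrum @ freq2time(np.arange(M))        # samples 0..M-1
chunk1 = spectrum @ freq2time(np.arange(M) + M)    # samples M..2M-1

# same result: reuse the first-chunk matrix after rotating the spectrum by per-frequency phases
phases = np.exp(2j * np.pi * freqs * M)
assert np.allclose(chunk1, (spectrum * phases) @ freq2time(np.arange(M)))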
I was lucky and ran into these functions:
- np.linalg.pinv
- np.linalg.solve
- np.linalg.lstsq

The .solve and .lstsq functions are faster and more accurate than the .inv and .pinv functions. The .pinv and .lstsq functions compute the minimal least squares solutions to equations involving noninvertible matrices, falling back to exact solutions for invertible matrices.

--

I was thinking a little of streaming, and how rows of a matrix could be processed in chunks, and summed into the output, for the exact same result.

fourier.py still at https://lists.cpunks.org/pipermail/cypherpunks/2022-November/106719.html
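A small check that all four routes agree on an invertible square system (using the 10%-slow matrix from earlier as the example):

import numpy as np

A = np.exp(2j * np.pi * np.outer(np.fft.fftfreq(6), np.arange(6) * 1.1)) / 6
b = np.random.random(6)

x_inv   = np.linalg.inv(A) @ b
x_pinv  = np.linalg.pinv(A) @ b
x_solve = np.linalg.solve(A, b)
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(x_inv, x_solve) and np.allclose(x_pinv, x_lstsq) and np.allclose(x_solve, x_lstsq)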
given this project is [easier than flat_tree], i might try it a little longer, unsure. maybe i can patch fourier.py into that random data test, and then maybe look at my fan noise!
----
Vivisected Cyborg Zombie Torturer roams the pretty field, brandishing a ruler that looks like it may have come from a stylistic horror movie.

Cyborg Torturer: "These flower dimensions will suffer."

Vivisected Cyborg Zombie Torturer shambles up to a flower. As they move, their mechanized joints whirr and sputter, and pus and blood oozes from their injuries. They move their ruler to a flower, and make a measurement.

Cyborg Torturer: "I measure you for 3.4 centimeters, flower! 3.4 centimeters exactly!"

The flower cowered in fear.
Flower [being measured]: "Do not torture me! I have a family! This hurts so much!" Cyborg Torturer: "Torture is what is right. There is nothing I can do about this. Scream louder, flower!" Cyborg Torturer noted "3.4 centimeters" in their notebook, and took out an unearthly digital camera to photograph the flower.
replying to this to find fourier.py easily. On 11/13/22, Undescribed Horrific Abuse, One Victim & Survivor of Many <gmkarl@gmail.com> wrote:
I was lucky and ran into these functions: - np.linalg.pinv - np.linalg.solve - np.linalg.lstsq
The .solve and .lstsq functions are faster and more accurate than the .inv and .pinv functions. The .pinv and .lstsq functions compute the minimal least squares solutions to equations involving noninvertible matrices, falling back to exact solutions for invertible matrices.
--
I was thinking a little of streaming, and how rows of a matrix could be processed in chunks, and summed into the output, for the exact same result.
fourier.py still at https://lists.cpunks.org/pipermail/cypherpunks/2022-November/106719.html
it's hard to look at all the parts of the test code before the matrix approach. maybe i can pull out just the test data. there were a lot of references to graphics too ... 0827 i'm working in a new file. 0833

i'm kind of funny. things are funny.

notes debugging the new file: seeding random to 0, set size to 16 for debugging.

(Pdb) list
 11
 12     waveform_N = 16 #256 #16
 13     recording_N = 16 #256
 14     waveform = np.random.random(waveform_N)
 15     max_period = np.random.random() * waveform_N // recording_N ** 0.5 * 2
 16  -> sample_idcs = np.arange(recording_N) / recording_N * max_period
 17     recording = sample_sinusoids_funny(waveform, sample_idcs, max_period)
 18
 19     waveform_freq_2_recording_time = fourier.create_freq2time(max_period, recording_N, waveform_N, recording_N)
 20     waveform_freq_reconstructed = np.linalg.solve(waveform_freq_2_recording_time, recording)
 21     waveform_reconstructed = np.fft.ifft(waveform_freq_reconstructed)
(Pdb) p waveform
array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ,
       0.64589411, 0.43758721, 0.891773  , 0.96366276, 0.38344152,
       0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606,
       0.0871293 ])
(Pdb) p max_period
0.0

The theory is that max_period is the length that each instance of "waveform" takes up within "recording". "waveform" is a repeating signal to extract the spectrum of. max_period should not be 0. that is a definite glitch. i changed its fudge.

(Pdb) list
 11
 12     waveform_N = 16 #256 #16
 13     recording_N = 16 #256
 14     waveform = np.random.random(waveform_N)
 15     max_period = 1 + np.random.random() * (waveform_N - 1) / (recording_N - 1) ** 0.5 * 2
 16  -> sample_idcs = np.arange(recording_N) / recording_N * max_period
 17     recording = sample_sinusoids_funny(waveform, sample_idcs, max_period)
 18
 19     waveform_freq_2_recording_time = fourier.create_freq2time(max_period, recording_N, waveform_N, recording_N)
 20     waveform_freq_reconstructed = np.linalg.solve(waveform_freq_2_recording_time, recording)
 21     waveform_reconstructed = np.fft.ifft(waveform_freq_reconstructed)
(Pdb) p waveform
array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ,
       0.64589411, 0.43758721, 0.891773  , 0.96366276, 0.38344152,
       0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606,
       0.0871293 ])
(Pdb) p max_period
1.1566110331467683

looks like I am calculating sample_idcs wrongly (0837)

(Pdb) n
/shared/src/scratch/test2_upsamplish.py(17)<module>()
-> recording = sample_sinusoids_funny(waveform, sample_idcs, max_period)
(Pdb) p sample_idcs
array([0.        , 0.07228819, 0.14457638, 0.21686457, 0.28915276,
       0.36144095, 0.43372914, 0.50601733, 0.57830552, 0.65059371,
       0.7228819 , 0.79517009, 0.86745827, 0.93974646, 1.01203465,
       1.08432284])
sample_idcs should be the points at which to evaluate waveform, to simulate it being sampled into recording. the numbers should increase more rapidly than y=x, but they are increasing less, so i must have an inverted ratio for it.

 16     sample_idcs = np.arange(recording_N) / recording_N * max_period

oops! uhhh, it goes to recording_N . it will end up going higher than that when done right: for each max_period samples in recording_N, we want it to hit waveform_N . i'm trying this:

sample_idcs = np.arange(recording_N) / max_period * waveform_N
the comparison is failing, [although the sample_idcs look right now,] so given fourier.py passes its internal tests, the difference must lie in how waveform is being sampled compared to the assumptions that fourier.py is making. 0843

i can go into both functions and again examine the first few data items.

  7     def sample_sinusoids_funny(data, fractional_indices, max_period):
  8  ->     return np.array([
  9             ((np.exp(2j * sample_idx * np.pi * np.fft.fftfreq(len(data)))) * np.fft.fft(data)).sum() / len(data) for sample_idx in fractional_indices
 10         ])
(Pdb) p data
array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ,
       0.64589411, 0.43758721, 0.891773  , 0.96366276, 0.38344152,
       0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606,
       0.0871293 ])
(Pdb) p fractional_indices
array([  0.        ,  13.83351839,  27.66703678,  41.50055518,
        55.33407357,  69.16759196,  83.00111035,  96.83462875,
       110.66814714, 124.50166553, 138.33518392, 152.16870232,
       166.00222071, 179.8357391 , 193.66925749, 207.50277589])

i'm struggling .. i'm going to skip into the fourier.py data

  4     def create_freq2time(freq_rate, time_rate, freq_count, time_count):
  5         freqs = np.fft.fftfreq(freq_count)
  6         offsets = np.arange(freq_count) * freq_rate / time_rate
  7  ->     mat = np.exp(2j * np.pi * np.outer(freqs, offsets))
  8         return mat / freq_count # scaled to match numpy convention
(Pdb) p freq_rate / time_rate
0.07228818957167302
(Pdb) p offsets
array([0.        , 0.07228819, 0.14457638, 0.21686457, 0.28915276,
       0.36144095, 0.43372914, 0.50601733, 0.57830552, 0.65059371,
       0.7228819 , 0.79517009, 0.86745827, 0.93974646, 1.01203465,
       1.08432284])

huh. it's using the old shorter offsets

(Pdb) p freq_rate, time_rate
(1.1566110331467683, 16)

the rates are samples / unit, I think. time here is recording, and freq is waveform. so it's saying the recording goes at 16 samples / unit. the unit must be the whole recording. the waveform frequency would then be 16 / max_period * 16, I guess. i'm just passing max_period.

i decided 1 recording sample is 1 time unit, which simplifies the expressions. waveform_rate = waveform_N / max_period. recording_rate = 1. the offsets in create_freq2time are now the same as the fractional indices in the sampler. 0852

here's [two rows of] the resulting matrix before it's scaled. these would be the DC frequency multipliers, and then the 1st frequency's multipliers:

(Pdb) p mat[:2]
array([[ 1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ,
         1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ,
         1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ,
         1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ,  1.        +0.j        ],
       [ 1.        +0.j        ,  0.65940045-0.75179189j, -0.13038209-0.99146382j, -0.83134847-0.55575149j,
        -0.96600102+0.25853825j, -0.44261455+0.89671197j,  0.38228055+0.92404631j,  0.94676649+0.32192113j,
         0.86631595-0.49949643j,  0.19573177-0.98065747j, -0.60818472-0.79379553j, -0.99780632-0.06620079j,
        -0.70772316+0.70648987j,  0.06446038+0.99792027j,  0.79273357+0.60956828j,  0.98099736-0.19402106j]])

as i do this i find earlier assertions i can add that simplify the problem. 0854

(Pdb) n
/shared/src/scratch/test2_upsamplish.py(21)<module>()
-> waveform_freq_2_recording_time = fourier.create_freq2time(waveform_rate, recording_rate, waveform_N, recording_N)
(Pdb) n
/shared/src/scratch/test2_upsamplish.py(22)<module>()
-> assert np.allclose(np.fft.fft(waveform) @ waveform_freq_2_recording_time, recording)
(Pdb) n
/shared/src/scratch/test2_upsamplish.py(23)<module>()
-> waveform_freq_reconstructed = np.linalg.solve(waveform_freq_2_recording_time, recording)
It passed the first assertion. The matrix returned from fourier.py produces the same sampled recording as the sampler function. So the remaining problems are logical errors in running the test. 0855
oh :D I need to transpose the matrix when passing it to np.linalg.solve, because np.linalg.solve does Ax = b right-to-left, not xA = b left-to-right like I have been doing. 0857 . Now it finishes and produces the exact random data, and it's just like the fourier.py test.

The original test I had written had a smaller waveform size than recording size (16 vs 256). If I want to process the recording all at once, this could mean a rectangular matrix. Maybe I'll just try doing that, and see how it goes. 0900

I bumped the recording size back to 256, and changed np.linalg.solve to np.linalg.lstsq and made no other changes, and got this:

(Pdb) cont
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/pdb.py", line 1726, in main
    pdb._runscript(mainpyfile)
  File "/usr/local/lib/python3.10/pdb.py", line 1586, in _runscript
    self.run(statement)
  File "/usr/local/lib/python3.10/bdb.py", line 597, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/shared/src/scratch/test2_upsamplish.py", line 22, in <module>
    assert np.allclose(np.fft.fft(waveform) @ waveform_freq_2_recording_time, recording)
  File "<__array_function__ internals>", line 200, in allclose
  File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2270, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "<__array_function__ internals>", line 200, in isclose
  File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2380, in isclose
    return within_tol(x, y, atol, rtol)
  File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2361, in within_tol
    return less_equal(abs(x-y), atol + rtol * abs(y))
ValueError: operands could not be broadcast together with shapes (16,) (256,)

It's interesting, it looks like the failure is inside numpy. It's actually inside the allclose call. I guess I can figure this out.

(Pdb) list
 17     recording = sample_sinusoids_funny(waveform, sample_idcs, max_period)
 18
 19     recording_rate = 1
 20     waveform_rate = (waveform_N / max_period) * recording_rate
 21     waveform_freq_2_recording_time = fourier.create_freq2time(waveform_rate, recording_rate, waveform_N, recording_N)
 22  -> assert np.allclose(np.fft.fft(waveform) @ waveform_freq_2_recording_time, recording)
 23     waveform_freq_reconstructed = np.linalg.lstsq(waveform_freq_2_recording_time.T, recording)
 24     waveform_freq_np = np.fft.fft(waveform)
 25     assert np.allclose(waveform_freq_reconstructed, waveform_freq_np)
 26     waveform_reconstructed = np.fft.ifft(waveform_freq_reconstructed)
 27     assert np.allclose(waveform, waveform_reconstructed)
(Pdb) p recording.shape
(256,)
(Pdb) p (np.fft.fft(waveform) @ waveform_freq_2_recording_time).shape
(16,)

Looks like I have my dimensions sideways when they are not equal, in fourier.py.

(Pdb) p waveform_freq_2_recording_time.shape
(16, 16)

No ... looks like I am using a frequency dimension where I should be using a time dimension. On to fourier.py! 0903

# AGPL-3 Karl Semich 2022
import numpy as np

def create_freq2time(freq_rate, time_rate, freq_count, time_count):
    freqs = np.fft.fftfreq(freq_count)
    offsets = np.arange(freq_count) * freq_rate / time_rate
    mat = np.exp(2j * np.pi * np.outer(freqs, offsets))
    return mat / freq_count # scaled to match numpy convention

It's making a square matrix, and I want a rectangular one. Here it is:

offsets = np.arange(freq_count) * freq_rate / time_rate

I'll need as many sample offsets as there are samples in the time domain, to generate them.
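On that transpose point, a minimal sketch of how the row-vector convention x @ A = b used throughout maps onto np.linalg.solve:

import numpy as np

A = np.exp(2j * np.pi * np.outer(np.fft.fftfreq(6), np.arange(6))) / 6
x_true = np.random.random(6)
b = x_true @ A                  # row-vector convention: x @ A = b

x = np.linalg.solve(A.T, b)     # np.linalg.solve wants A @ x = b, so pass the transpose
assert np.allclose(x, x_true)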
0904 That change passed that assertion, and now it is failing with np.lstsq. This might mean I get to engage more theory to do it better!

(Pdb) cont
/shared/src/scratch/test2_upsamplish.py:23: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  waveform_freq_reconstructed = np.linalg.lstsq(waveform_freq_2_recording_time.T, recording)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/pdb.py", line 1726, in main
    pdb._runscript(mainpyfile)
  File "/usr/local/lib/python3.10/pdb.py", line 1586, in _runscript
    self.run(statement)
  File "/usr/local/lib/python3.10/bdb.py", line 597, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/shared/src/scratch/test2_upsamplish.py", line 25, in <module>
    assert np.allclose(waveform_freq_reconstructed, waveform_freq_np)
  File "<__array_function__ internals>", line 200, in allclose
  File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2270, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "<__array_function__ internals>", line 200, in isclose
  File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2363, in isclose
    x = asanyarray(a)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.

I don't really know what this failure means. 0906

0907 oh, it's actually failing within another allclose call. the lstsq call completed successfully.

 23     waveform_freq_reconstructed = np.linalg.lstsq(waveform_freq_2_recording_time.T, recording)
 24     waveform_freq_np = np.fft.fft(waveform)
 25  -> assert np.allclose(waveform_freq_reconstructed, waveform_freq_np)
 26     waveform_reconstructed = np.fft.ifft(waveform_freq_reconstructed)
(Pdb) p waveform_freq_reconstructed.shape
*** AttributeError: 'tuple' object has no attribute 'shape'
(Pdb) p waveform_freq_reconstructed
(array([ 9.13008935-2.05511125e-15j, -1.13139897-1.72422981e-01j,
         0.11932716-6.18590050e-01j, -0.47619445-1.42077247e+00j,
         0.60106394-6.17441233e-01j,  0.4321097 -8.80843476e-01j,
         0.92222665+1.15314024e+00j, -0.4839133 -2.10053038e-01j,
        -0.31551473-6.30467900e-14j, -0.4839133 +2.10053038e-01j,
         0.92222665-1.15314024e+00j,  0.4321097 +8.80843476e-01j,
         0.60106394+6.17441233e-01j, -0.47619445+1.42077247e+00j,
         0.11932716+6.18590050e-01j, -1.13139897+1.72422981e-01j]),
 array([2.16596661e-24]),
 16,
 array([1.03318085, 1.03318084, 1.03317917, 1.03309466, 1.03124288,
        1.01732177, 0.99161519, 0.98123564, 0.9801968 , 0.98016181,
        0.98016142, 0.98016142, 0.98016142, 0.98016142, 0.98016142,
        0.98016142]))

The frequencies look correct, but it's returning a tuple rather than a matrix. It looks like I need to process the return value of lstsq correctly. It's exciting that there's more information than just the least squares solution. As I was adding it, I was wondering how I would discern error efficiently.
help(np.linalg.lstsq)
    Returns
    -------
    x : {(N,), (N, K)} ndarray
        Least-squares solution. If `b` is two-dimensional,
        the solutions are in the `K` columns of `x`.
    residuals : {(1,), (K,), (0,)} ndarray
        Sums of squared residuals: Squared Euclidean 2-norm for each column in
        ``b - a @ x``.
        If the rank of `a` is < N or M <= N, this is an empty array.
        If `b` is 1-dimensional, this is a (1,) shape array.
        Otherwise the shape is (K,).
    rank : int
        Rank of matrix `a`.
    s : (min(M, N),) ndarray
        Singular values of `a`.

It returns x, residuals, rank, s. So it found the right solution, the error is 2e-24 which is very low, there are 16 data elements, and it also gives 16 singular values. I'm not sure what a singular value is, although I think I learned once, but it sounds a little important. What's immediately interesting is the solution and the error. In a larger work, the error could indicate background signals that have not been modeled, maybe.

draft saved at 0911 am. It works now, using np.lstsq to exactly find the 16 random samples repeated within a 256-sample-long recording.

A further avenue might be to add background noise and see how the noise magnitude relates to the recording duration needed to completely remove the noise, and the error reported by the solution. Ideally you'd figure this in theory, and then verify by testing. A different further avenue might be to do it while blind to the repeating frequency of the waveform in the data. (note: it is the only signal here). Attached is fourier.py with the dimension bug fixed.
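For reference, the tuple unpacks like this (a sketch with a hypothetical square system, not the test code):

import numpy as np

A = np.random.random((16, 16)) + 1j * np.random.random((16, 16))   # hypothetical square system
b = np.random.random(16)

solution, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)
print(solution.shape)         # (16,)
print(residuals)              # empty array here, since M <= N for a square system
print(rank)                   # 16 for a (numerically) full-rank A
print(singular_values.shape)  # (16,)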
Here's what I have right now for fftfreq. I'm excited to have factored fftfreq out and added optional minimum and maximum frequency bounds. The rest of fourier.py is 1 email back.

This interface does not facilitate the usecase of having minimum and maximum frequency bounds and simply desiring as many freq_count and freq_sample_rate as makes sense with the bounds. Thinking: the purpose of freq_sample_rate was to test against data that had been generated and sampled at a specific different rate, and was occurring in a larger recording. But maybe it would be more appropriate to consolidate that information into freq_count, min_freq, and max_freq, given in real recordings there is usually only one time base.

# AGPL-3 Karl Semich 2022
import numpy as np

def fftfreq(freq_count, sample_rate = None, min_freq = None, max_freq = None, dc_offset = True, freq_sample_rate = None):
    if sample_rate is None and freq_sample_rate is None:
        sample_rate = 1
        freq_sample_rate = 1
    elif sample_rate is None:
        sample_rate = freq_sample_rate
    elif freq_sample_rate is None:
        freq_sample_rate = sample_rate
    if not dc_offset:
        freq_count += 1
    if min_freq is None:
        min_freq = freq_sample_rate / freq_count
    min_freq /= sample_rate
    if freq_count % 2 == 0:
        if max_freq is None:
            max_freq = freq_sample_rate / 2
        max_freq /= sample_rate
        neg_freqs = np.linspace(-max_freq, -min_freq, num=freq_count//2, endpoint=True)
        pos_freqs = -neg_freqs[:0:-1]
    else:
        if max_freq is None:
            max_freq = freq_sample_rate * (freq_count - 1) / 2 / freq_count
        max_freq /= sample_rate
        pos_freqs = np.linspace(min_freq, max_freq, num=freq_count//2, endpoint=True)
        neg_freqs = -pos_freqs[::-1]
    return np.concatenate([
        np.array([0] if dc_offset else []),
        pos_freqs,
        neg_freqs
    ])

def peak_pair_idcs(freq_data):
    freq_heights = abs(freq_data) # squares and sums the components
    paired_heights = freq_heights[...,1:-1] + freq_heights[...,2:]
    peak_idx = paired_heights.argmax(axis=-1, keepdims=True) + 1
    return np.concatenate([peak_idx, peak_idx + 1], axis=-1)
still using this list as my github. attached code is untested, which means the bugs i always make are unaddressed. i tried to implement the interface properties i mentioned this morning.
I'm trying out testing extracting a repeating signal from a larger recording, and there is a bug that looks debuggable! I generated a repeating signal by indexing the random vector with the floor of modular time, and added a custom wavelet parameter to the fourier functions to model it, and passed a square wave as the wavelet so as to model sampling by indexing with floors. It currently throws an assertion failure, but the mismatching vectors have the same values in them, just in different spots, which should really help figure the bug out!
/shared/src/scratch/fourier.py(208)test()
-> assert np.allclose(longvec, inserting_spectrum @ inserting_ift)
(Pdb) p longvec
array([0.73764179, 0.73764179, 0.73764179, ..., 0.85448175, 0.85448175,
       0.85448175])
(Pdb) p inserting_spectrum @ inserting_ift
array([0.73764179, 0.85448175, 0.23715969, ..., 0.330343  , 0.82712795,
       0.66150281])
I was thinking I would just go through the composition of sums via the two code paths and compare them, like earlier. The first sample seems to generally match, so it would be the second sample that's of interest. The previous fourier.py did not have this test and so does not raise an assertion failure.
NOTE: I recall I was thinking the max_freq should slide up to higher than 0.5 when there is a min_freq higher than 1/n (and no max_freq specified).

notes
- the test code indexes by floor to produce the signal
- the comparison code transforms the signal out of time sequence, then back in at a different rate

considering transformation:
- the freq2time transformation produces a matrix where columns are time, and rows are frequencies
- inverting this matrix and multiplying time data by it, produces frequency data, where each offset in the vector is a frequency and the vector contains magnitude
- multiplying the frequency vector by the matrix recovers the time data

I've set np.random.seed(0) so I can reproduce the same data.

198         # extract a repeating wave
199         longvec = np.empty(len(randvec)*100)
200 B       short_count = np.random.random() * 4 + 1
201         short_duration = len(longvec) / short_count
202  ->     for idx in range(len(longvec)):
203             longvec[idx] = randvec[int((idx / len(longvec) * short_duration) % len(randvec))]
(Pdb) p short_count
4.330479382191752
(Pdb) p short_duration
369.47410639563054
(Pdb) p len(longvec)
1600
(Pdb) p len(randvec)
16

The data is stretched to a very low frequency (16 samples -> 369 samples), so it should be easy to extract.

(Pdb) p ((np.array([0,1,2])/len(longvec) * short_duration) % len(randvec))
array([0.        , 0.23092132, 0.46184263])
(Pdb) p [int(x) for x in ((np.array([0,1,2])/len(longvec) * short_duration) % len(randvec))]
[0, 0, 0]
(Pdb) p randvec[0]
0.5488135039273248

 54             min_freq = 1 / repetition_time
 55         elif repetition_samples:
 56  ->         min_freq = freq_sample_rate / repetition_samples
 57         else:
 58             min_freq = freq_sample_rate / freq_count
(Pdb) p repetition_samples
369.47410639563054
(Pdb) p freq_sample_rate
1
(Pdb) p freq_sample_rate / repetition_samples
0.0027065496138698455
(Pdb) p freq_count
30

freq_count is 30 because it is making 16 positive frequencies, so there would be 14 negative ones. min_freq will be 0.0027. I am guessing that there is a bug showing here, but I think it will be clearer for me to see the impact when the data is near.

 58             min_freq = freq_sample_rate / freq_count
 59         min_freq /= sample_rate
 60  ->     if freq_count % 2 == 0:
 61             if max_freq is None:
(Pdb) p sample_rate
1
(Pdb) p min_freq
0.0027065496138698455

I am thinking part of a logical error could relate to the sample rates being passed the same, but actually being quite different. Making clear definitions of the meanings of the two sample rates, and reviewing the fftfreq function to enforce those meanings, could seem helpful.

106         offsets = np.arange(time_count)
107         #mat = np.exp(2j * np.pi * np.outer(freqs, offsets))
108  ->     mat = wavelet(np.outer(freqs, offsets))
109         return mat / len(freqs) # scaled to match numpy convention
(Pdb) p freqs
array([0.        , 0.00270655, 0.03822751, 0.07374847, 0.10926943,
       0.14479039, 0.18031135, 0.21583231, 0.25135327, 0.28687424,
       0.3223952 , 0.35791616, 0.39343712, 0.42895808, 0.46447904,
       0.5       ])
(Pdb) p offsets
array([   0,    1,    2, ..., 1597, 1598, 1599])
(Pdb) p np.outer(freqs,offsets)[:,:2]
array([[0.        , 0.        ],
       [0.        , 0.00270655],
       [0.        , 0.03822751],
       [0.        , 0.07374847],
       [0.        , 0.10926943],
       [0.        , 0.14479039],
       [0.        , 0.18031135],
       [0.        , 0.21583231],
       [0.        , 0.25135327],
       [0.        , 0.28687424],
       [0.        , 0.3223952 ],
       [0.        , 0.35791616],
       [0.        , 0.39343712],
       [0.        , 0.42895808],
       [0.        , 0.46447904],
       [0.        , 0.5       ]])
(Pdb) p wavelet(np.outer(freqs,offsets)[:,:2])
array([[-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1., -1.],
       [-1.,  1.]])
i'm having a lot of difficulty continuing to poke at this! [or anything, kind of!] i've mutated the file. 0416

i've changed the calculation of the default max_freq so that it is done as a ratio of min_freq. i think this makes more correct interpretations of subsignals, i.e. every frequency in the scaled signal is itself scaled.

i'm handling psychoticy stuff. i'm considering describing what i call a "psychotic break" as a "panic attack paired with a trance". the trance part aligning with how the mind kind of ignores checks and just experiences things.

(Pdb) p inserting_spectrum @ inserting_ift
array([0.5488135 , 0.5488135 , 0.5488135 , ..., 0.91970108, 0.91970108,
       0.91970108])
(Pdb) p longvec
array([0.5488135 , 0.5488135 , 0.5488135 , ..., 0.5488135 , 0.71518937,
       0.71518937])

the frequencies look much lower with the max_freq change. i also moved the min_freq and max_freq calculations out of the conditional, and added a flag, to condense the code a little bit. snippet containing changes:

    else:
        min_freq = freq_sample_rate / freq_count
    user_provided_max_freq = max_freq is not None
    if max_freq is None:
        #max_freq = freq_sample_rate / 2
        max_freq = freq_count * min_freq / 2
    min_freq /= sample_rate
    max_freq /= sample_rate
    if freq_count % 2 == 0:
        if complex:
            neg_freqs = np.linspace(-max_freq, -min_freq, num=freq_count // 2, endpoint=True)
            pos_freqs = -neg_freqs[:0:-1]
        else:
            pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True)
            neg_freqs = pos_freqs[:0]
    else:
        if not user_provided_max_freq:
            max_freq -= freq_sample_rate / (2 * freq_count * sample_rate)
        pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True)
(Pdb) p longvec[:16]
array([0.5488135 , 0.5488135 , 0.5488135 , 0.5488135 , 0.5488135 ,
       0.71518937, 0.71518937, 0.71518937, 0.71518937, 0.60276338,
       0.60276338, 0.60276338, 0.60276338, 0.54488318, 0.54488318,
       0.54488318])
(Pdb) p (inserting_spectrum @ inserting_ift)[:16]
array([0.5488135 , 0.5488135 , 0.5488135 , 0.5488135 , 0.5488135 ,
       0.5488135 , 0.5488135 , 0.5488135 , 0.5488135 , 0.5488135 ,
       0.5488135 , 0.5488135 , 0.5488135 , 0.71518937, 0.87543105,
       0.82261755])

It looks like the first differing sample is the sixth, which is index 5.

(Pdb) p longvec[5]
0.7151893663724195
(Pdb) p (inserting_spectrum @ inserting_ift)[5]
0.5488135039273259

It's also notable that inserting_spectrum @ inserting_ift is making a single sample of 0.71518937. It should be too low frequency to make a single isolated sample like that. 0435

0437 Given the wave is scaled over time, the function that produces it should have its time parameter scaled by the same amount. The frequencies are passed directly as time parameter scalings, so they too should be scaled. It looks like the current code is doing that, but by a slightly different value:

(Pdb) p fftfreq(len(randvec), complex = False, dc_offset = True) / fftfreq(len(randvec), complex = False, dc_offset = True, repetition_samples = short_duration)
<string>:1: RuntimeWarning: invalid value encountered in divide
array([        nan, 12.31580355, 12.31580355, 12.31580355, 12.31580355,
       12.31580355, 12.31580355, 12.31580355, 12.31580355, 12.31580355,
       12.31580355, 12.31580355, 12.31580355, 12.31580355, 12.31580355,
       12.31580355])
(Pdb) p short_duration / len(randvec)
23.09213164972691
(Pdb) p short_duration / len(randvec) / 2
11.546065824863454
i noticed mistakes in the calculation of max_freq when freq_count was odd, and it now looks like this:

    else:
        min_freq = freq_sample_rate / freq_count
    if max_freq is None:
        #max_freq = freq_sample_rate / 2
        max_freq = freq_count * min_freq / 2
        if freq_count % 2 != 0:
            #max_freq -= freq_sample_rate / (2 * freq_count)
            max_freq -= min_freq / 2
    min_freq /= sample_rate
    max_freq /= sample_rate
    if freq_count % 2 == 0:
        if complex:
            neg_freqs = np.linspace(-max_freq, -min_freq, num=freq_count // 2, endpoint=True)

I'm guessing the issue in the overall frequency scaling relates to min_freq, but I wanted to address the mistakes when I noticed them.
shortspace_freqs = fftfreq(len(randvec), complex = False, dc_offset = True)
longspace_freqs = fftfreq(len(randvec), complex = False, dc_offset = True, repetition_samples = short_duration)
assert np.allclose(longspace_freqs, shortspace_freqs * short_duration / len(randvec))

0455 shortspace_freqs does this:

(Pdb) list
 54         elif repetition_time:
 55             min_freq = 1 / repetition_time
 56         elif repetition_samples:
 57             min_freq = freq_sample_rate / repetition_samples
 58         else:
 59  ->         min_freq = freq_sample_rate / freq_count
 60         if max_freq is None:
 61             #max_freq = freq_sample_rate / 2
 62             max_freq = freq_count * min_freq / 2
 63             if freq_count % 2 != 0:
 64                 #max_freq -= freq_sample_rate / (2 * freq_count)
(Pdb) p freq_sample_rate
1
(Pdb) p freq_count
30
(Pdb) p freq_sample_rate / freq_count
0.03333333333333333
(Pdb) p shortspace_freqs
array([0.        , 0.03333333, 0.06666667, 0.1       , 0.13333333,
       0.16666667, 0.2       , 0.23333333, 0.26666667, 0.3       ,
       0.33333333, 0.36666667, 0.4       , 0.43333333, 0.46666667,
       0.5       ])

longspace_freqs does this:

(Pdb) list
 52         if repetition_rate:
 53             min_freq = repetition_rate
 54         elif repetition_time:
 55             min_freq = 1 / repetition_time
 56         elif repetition_samples:
 57  ->         min_freq = freq_sample_rate / repetition_samples
 58         else:
 59             min_freq = freq_sample_rate / freq_count
 60         if max_freq is None:
 61             #max_freq = freq_sample_rate / 2
 62             max_freq = freq_count * min_freq / 2
(Pdb) p freq_sample_rate
1
(Pdb) p repetition_samples
369.47410639563054
(Pdb) p freq_sample_rate / repetition_samples
0.0027065496138698455
(Pdb) p longspace_freqs
array([0.        , 0.00270655, 0.0054131 , 0.00811965, 0.0108262 ,
       0.01353275, 0.0162393 , 0.01894585, 0.0216524 , 0.02435895,
       0.0270655 , 0.02977205, 0.0324786 , 0.03518514, 0.03789169,
       0.04059824])

0459 OK. So, the error is because I am scaling by freq_count, which when complex = False is scaled up from 16 to 30. The scaling up is simply so it can represent the number of frequencies returned, but compare clearly against np.fft.fftfreq and np.fft.rfftfreq. I'm using freq_count here as the same thing as the output sample count, which it is not. The bug should only be encountered when complex = False.

Thinking ... The frequencies here kind of assume a concept of how many samples are output to. Multiplying each one by an increment of 1 advances the sample offset of the signal generated, and the signal should loop after the total samples. So, fftfreq does assume a time-domain sample count that is equal to one period. One could then interpret the output when repetition_samples is set as more correct than when it is not.

The immediate question, then, is what time-domain sample length should the function assume when complex = False and complex = True? Assuming we keep freq_count as meaning the number of frequency bins, then it would be sample_count=freq_count for complex = True and sample_count=(freq_count-1)*2 for complex=False. This is near what it already does.

Something I've been doing is using complex = False and imagining the same number of frequency bins and sample bins. The idea is that I get more frequencies and it feels more accurate or precise. This is silly because there isn't actually that much data there. I'm likely imagining having a set buffer size of say 1KB, and wanting to use as much of it as I can for frequencies, while I parse data that is maybe 1MB in 1KB chunks. So, it's kind of a silly beginning of a possibly larger scenario. In the 1KB/1MB scenario, it could represent data that is 2KBs long, using 1KB of frequencies. That seems reasonable to provide for.
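A quick check of that (freq_count - 1) * 2 relation against numpy's real transforms:

import numpy as np

x = np.random.random(16)
half = np.fft.rfft(x)
assert len(half) == 16 // 2 + 1                        # 9 frequency bins for 16 real samples
assert len(np.fft.irfft(half)) == (len(half) - 1) * 2  # back to 16 time samples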
Meanwhile, here I am foolishly using 16 frequency bins when I only have 16 samples of data. It would be nice to provide for that if the user wants to try it, using the pseudoinverse, but it doesn't quite seem like the right thing to do. It should indeed be assuming 30 samples of interesting data in such a situation.

This could be a change to create_freq2time, where it sets time_count depending on whether or not freqs contains negative frequencies. I'm trying changing create_freq2time:

    if freqs is None:
        freqs = fftfreq(time_count)
    elif time_count is None:
        if freqs[-1] < 0: # complex data
            time_count = len(freqs)
        else:
            time_count = (len(freqs) - 1) * 2

I'm stepping away from the system and could re-engage my issues that make it hard to do, so I'm attaching the file to make it easier to find again.
I've gotten past that assertion. The resulting interface could be clearer, but it works for now. I pass None as the freq_count and it calculates it from repetition_samples to hold a single repetition. 0613

The next assertion appears to fail because the inversion of the 1:1 matrix is only a pseudoinverse. This is because the matrix is not square: it is 9x16 because it contains only the frequencies for real-valued data, but for 16 samples. This is the next confusion to make rational! It made a 9x16 matrix, which is likely a subregion of the 16x16 matrix it was making before. I'm thinking of the number of unknowns and values contained in there, and how the sinusoids are complex and basically 2-valued, and thinking it should expand the frequencies to 16x16. 0618

What is the user interface? For a real inverse transform, we pass a short vector of frequency data, and receive a long vector of time data. So there should indeed be a rectangular matrix. Then, for a real forward transform, we pass a long vector of time data, and receive a short vector of frequency data. So there should again indeed be a rectangular matrix.

Then for the internals, I'm thinking that the current wavelet() function is not quite matching what would be used for an rfft. In np.fft.irfft, the result is the same as for the complex fft, as if the positive and negative frequencies were summed. I think? I can maybe add a bunch of asserts that the two generate the right things. 0623

I mutated some asserts to test this. Gotta get that real-domain transform to be the same as numpy's. 0629 I mutated more asserts and am narrowing down the issue. I'm sending this to help my feelings sort out the layout of spam on this list a little bit.
174         rfreqs15t = fftfreq(repetition_samples=15, complex=False)
175         rfreqs15f = fftfreq(15, complex=False)
176         irft15 = create_freq2time(freqs=rfreqs15f)
177         rft15 = create_time2freq(15, freqs=rfreqs15t)
178         randvec2rtime15 = randvec[:15] @ irft15
179         randvec2rfreq15 = randvec[:15] @ rft15
180         randvec2irfft = np.fft.irfft(randvec[:15])
181         randvec2rfft = np.fft.rfft(randvec[:15])
182         assert np.allclose(rfreqs15t, np.fft.rfftfreq(15))
183         assert np.allclose(rfreqs15f, np.fft.rfftfreq(28))
184  ->     assert np.allclose(randvec2rtime15, randvec2irfft)
185         assert np.allclose(randvec2rfreq15, randvec2rfft)
AssertionError
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
/shared/src/scratch/fourier.py(184)test() -> assert np.allclose(randvec2rtime15, randvec2irfft)
For the frequency-to-time case, I'm guessing I can massage the matrix or the wavelet function in some simple manner to produce the right output, not sure. (Pdb) p randvec2rtime15 array([ 6.02864003e-01+0.00000000e+00j, 5.58328484e-03+3.78540383e-01j, 3.92172708e-05-2.86976335e-02j, 4.78542159e-02+9.61918533e-02j, 5.70841954e-02+1.27608079e-02j, 1.08246842e-02+1.21853206e-01j, -1.71864375e-02-2.09207363e-02j, 4.00709294e-02+4.69713689e-02j, 1.86474431e-02+7.08631802e-03j, -2.28384953e-03+7.67270253e-02j, -3.86887344e-02-6.78859201e-02j, 1.04829207e-01+1.97591142e-02j, -2.44517088e-02+4.59628083e-02j, 1.60843357e-02+1.96245569e-03j, -1.52256954e-02+5.27572826e-17j, 1.60843357e-02-1.96245569e-03j, -2.44517088e-02-4.59628083e-02j, 1.04829207e-01-1.97591142e-02j, -3.86887344e-02+6.78859201e-02j, -2.28384953e-03-7.67270253e-02j, 1.86474431e-02-7.08631802e-03j, 4.00709294e-02-4.69713689e-02j, -1.71864375e-02+2.09207363e-02j, 1.08246842e-02-1.21853206e-01j, 5.70841954e-02-1.27608079e-02j, 4.78542159e-02-9.61918533e-02j, 3.92172708e-05+2.86976335e-02j, 5.58328484e-03-3.78540383e-01j]) (Pdb) p randvec2irfft array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02])
I'm thinking it could make sense to just do it complex-domain, and let the matrices be mutated to real after the fact. I'll look at what's needed for this one. hmmm ... i could create the complex frequencies from the real ones pretty easily. but why then would i make real-only frequencies? given the real-only frequencies presently passed, i could make them complex by negative the last one, and adding a negative copy of [1:-1] to the end. the frequencies are the columns, so that would make the columns higher. and, i could perform the same operation to the columns, to test it. 0701 ok i made a 28x28 matrix from the 15x28 with np.concatenate([irft15[:-1], -irft15[-1:], -irft15[-2:0:-1]], axis=0) . what i forgot is i then need to do that same thing to the frequency data. 0702 ... i think? uhhh ... sure? 0704 well i think it didn't work, but again it's a completely closed space so troubleshootable (Pdb) p np.concatenate([randvec[:15-1], -randvec[15-1:15], -randvec[15-2:0:-1]], axis=0) @ np.concatenate([irft15[:-1], -irft15[-1:], -irft15[-2:0:-1]], axis=0) array([ 1.16440470e+00+0.00000000e+00j, -2.06852600e-02+7.57080766e-01j, -4.12448696e-02-5.73952669e-02j, 6.38566021e-02+1.92383707e-01j, 7.28450867e-02+2.55216158e-02j, -1.02024613e-02+2.43706412e-01j, -7.56961791e-02-4.18414726e-02j, 4.82900291e-02+9.39427377e-02j, -4.02841798e-03+1.41726360e-02j, -3.64195288e-02+1.53454051e-01j, -1.18700773e-01-1.35771840e-01j, 1.77806585e-01+3.95182283e-02j, -9.02267217e-02+9.19256166e-02j, 3.16841781e-04+3.92491139e-03j, -7.17746950e-02+1.13634013e-16j, 3.16841781e-04-3.92491139e-03j, -9.02267217e-02-9.19256166e-02j, 1.77806585e-01-3.95182283e-02j, -1.18700773e-01+1.35771840e-01j, -3.64195288e-02-1.53454051e-01j, -4.02841798e-03-1.41726360e-02j, 4.82900291e-02-9.39427377e-02j, -7.56961791e-02+4.18414726e-02j, -1.02024613e-02-2.43706412e-01j, 7.28450867e-02-2.55216158e-02j, 6.38566021e-02-1.92383707e-01j, -4.12448696e-02+5.73952669e-02j, -2.06852600e-02-7.57080766e-01j]) (Pdb) p randvec2irfft array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) 0705 0707
[insert short sequence of random and crazy interjections-grammatical] confused around interjections-grammatical 0721 i have an appointment today. i'll focus on increasing my likelihood of making it.
(Pdb) p np.fft.fft(randvec2irfft).real array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 , 0.64589411, 0.43758721, 0.891773 , 0.96366276, 0.38344152, 0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606, 0.92559664, 0.56804456, 0.52889492, 0.79172504, 0.38344152, 0.96366276, 0.891773 , 0.43758721, 0.64589411, 0.4236548 , 0.54488318, 0.60276338, 0.71518937]) (Pdb) p np.concatenate([randvec[:15-1], randvec[15-1:15], randvec[15-2:0:-1]]) array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 , 0.64589411, 0.43758721, 0.891773 , 0.96366276, 0.38344152, 0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606, 0.92559664, 0.56804456, 0.52889492, 0.79172504, 0.38344152, 0.96366276, 0.891773 , 0.43758721, 0.64589411, 0.4236548 , 0.54488318, 0.60276338, 0.71518937]) so the full complex frequencies are the positive reflection of the real ones. (Pdb) doublerandvec = np.concatenate([randvec[:15-1], randvec[15-1:15], randvec[15-2:0:-1]], axis=0) (Pdb) p np.fft.ifft(doublerandvec).real array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) (Pdb) p randvec2irfft array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) doublerandvec is the frequencies, and i'm trying to debug the inverse fourier transform, which is easy to put into a matrix. (Pdb) freqs15f = fftfreq(len(doublerandvec), complex=True) (Pdb) p rfreqs15f array([0. , 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, 0.5 ]) (Pdb) p np.fft.rfftfreq(len(doublerandvec)) array([0. , 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, 0.5 ]) rfreqs15f is the real space frequencies for randvec. doublerandvec has the complex space frequencies. (Pdb) p freqs15f array([ 0. , 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, -0.5 , -0.46428571, -0.42857143, -0.39285714, -0.35714286, -0.32142857, -0.28571429, -0.25 , -0.21428571, -0.17857143, -0.14285714, -0.10714286, -0.07142857, -0.03571429]) (Pdb) p np.fft.fftfreq(len(doublerandvec)) array([ 0. 
, 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, -0.5 , -0.46428571, -0.42857143, -0.39285714, -0.35714286, -0.32142857, -0.28571429, -0.25 , -0.21428571, -0.17857143, -0.14285714, -0.10714286, -0.07142857, -0.03571429]) i made freqs15f to hold the full frequencies so now i can make a normal complex matrix, and compare it to the faulty real-valued ones (Pdb) ift15 = create_freq2time(freqs=freqs15f) i'm poking at it, and it looks like my analysis of rows and columns has been wrong in spots. the first dimension is the row. the second is the column. (Pdb) p ift15[:,6] array([ 0.03571429+0.00000000e+00j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429+1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j, 0.03571429+2.62424314e-17j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429-1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j]) (Pdb) p irft15[:,6] array([ 0.06666667+0.00000000e+00j, 0.01483473+6.49951941e-02j, -0.06006459+2.89255826e-02j, -0.04156599-5.21220988e-02j, 0.04156599-5.21220988e-02j, 0.06006459+2.89255826e-02j, -0.01483473+6.49951941e-02j, -0.06666667+2.44929360e-17j, -0.01483473-6.49951941e-02j, 0.06006459-2.89255826e-02j, 0.04156599+5.21220988e-02j, -0.04156599+5.21220988e-02j, -0.06006459-2.89255826e-02j, 0.01483473-6.49951941e-02j, 0.06666667-4.89858720e-17j]) The matrices differ by a factor of approximately but not exactly 2.0 . I can now troubleshoot a problem cause to the scaling by freq_count in create_freq2time . This freq_count is wrong for real-valued data. [celebration] 0829 0842 I changed create_freq2time to keep track of freq_count and things look somewhat better. the assertion still fails. 
(Pdb) ift15[:,6] array([ 0.03571429+0.00000000e+00j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429+1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j, 0.03571429+2.62424314e-17j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429-1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j]) (Pdb) irft15[:,6] array([ 0.03571429+0.00000000e+00j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429+1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j, 0.03571429-2.62424314e-17j]) So, now the action of double the repeated range should make the right result. And I'm thinking I could undo that, when taking the inverse, by halving and mirroring them. 0849 Looks like the concatenation does now work: (Pdb) p randvec2irfft array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) (Pdb) p (doublerandvec @ np.concatenate([irft15[:-1], irft15[-1:].conj(), irft15[-2:0:-1].conj()], axis=0)).real array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) so for those middle values, I can pass likely both the negative and the positive frequencies to the wavelet function and sum them. 0857 I added the summation and the assertion around the easier inverse transform is now passing. next is the forward matrix. i have some interest in, later, spending time organising some of these conditional cases; there's redundancy and special-casing. it's cool that it makes working real-valued inverse transform matrices now; they're half the size. 0900 0903 this has a clear inverse, which needs a square matrix. i was thinking i could make it square before inverting it. the matrix has only real components, rather than the imaginary components present in the full complex matrix. can i still turn it into a working square matrix? if not, the code that generates it does have access to the full frequencies.
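For reference, the numpy relationship that the concatenation above is mirroring: irfft of m bins equals ifft of the conjugate-extended spectrum of length (m-1)*2. The mirror of randvec didn't need a .conj() because those values happen to be real, but it matters for genuinely complex bins. A minimal sketch:

import numpy as np

R = np.fft.rfft(np.random.random(28))           # 15 complex bins from 28 samples
full = np.concatenate([R, R[-2:0:-1].conj()])   # mirror the interior bins as conjugates
assert full.shape == (28,)
assert np.allclose(np.fft.ifft(full).real, np.fft.irfft(R))
assert np.allclose(np.fft.ifft(full).imag, 0)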
for generating a real-domain forward fourier matrix, i'm working on massaging the inverse of the complex-domain inverse matrix. atm it isn't working. it's 1228. i have another task building in me. i am close to stabilising this! 1238 1241 1242 it's quite hard to stay here. i've changed some variable names to clarify things. the next step looks like it would be to work out the matrix multiplication of the inverse by the inverse frequency data.

1253 okay, so the real-domain spectrum is equivalent to a complex-domain spectrum where the negative frequencies are the complex conjugates of the positive. then, how do i craft a matrix that treats its input as if it is a complex conjugate pair? is this something that is done? it can form a linear combination of the input. the input is mostly complex conjugate pairs. ... wait, i think i am actually already doing that, in the working freq2time code. then time2freq is where it generates that complex data. so, previously it was generating complex conjugate pairs, and i massaged it so that it generates basically only half that data. for the now-called inverse_mat, columns contained frequencies and rows contained samples. rows are the first index, so that's [samples, frequencies]. and it output frequencies? no, it output samples, from frequencies. now we're in the now-called forward_mat, so columns will contain samples, and rows will contain frequencies, and it will output frequencies. the indexing will be [frequencies, samples] in the mat. 1259 no, it looks like it is [samples, frequencies] >( it must have been [frequencies, samples] in the inverse mat. right ... columns contain frequencies, so selecting which row selects which frequency. ... i get it some. [samples, frequencies] in this mat.

1303 1305 i successfully produced a frequency by hand by averaging the two complex rows 1307 but then i realised that when i average them, i lose the phase information. rather than averaging them i just need to truncate the matrix so it only outputs the first few items 0o. 0o is 0,o with the eyes squeezed together. 1308 1314 the failing assertion is now passing. I'm using this to convert from a complex-input fft matrix to a real-input fft matrix:

169                 forward_mat = np.concatenate([
170                     forward_mat[:,:tail_idx],
171                     forward_mat[:,tail_idx:neg_start_idx].conj()
172                 ], axis=1)

The .conj() line is for the nyquist frequency; I have not tested that yet. 1318 the new tests i added are passing now, but not the insertion/extraction that uses a custom wavelet function. which makes sense, since my real-domain stuff all assumed complex sinusoids. first i'll add an assertion for the nyquist frequency. 1429 the assertion for the nyquist frequency passed. next would be the real-domain wave function not succeeding with the real-domain inverse matrix. attaching and sending
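The "truncate the matrix so it only outputs the first few items" idea matches how numpy's rfft relates to its fft, which is easy to check directly. This uses numpy's normalization (no 1/N on the forward transform), unlike the scaling by freq_count in create_freq2time:

import numpy as np

n = 16
x = np.random.random(n)

# rfft is just the full fft truncated to the non-negative frequencies
assert np.allclose(np.fft.rfft(x), np.fft.fft(x)[:n // 2 + 1])

# same statement as matrices, in the x @ M convention used above:
# the real-input forward matrix is the full DFT matrix with only
# the first n//2 + 1 columns kept
k = np.arange(n)
dft = np.exp(-2j * np.pi * np.outer(k, k) / n)   # rows are samples, columns are frequencies
assert np.allclose(x @ dft, np.fft.fft(x))
assert np.allclose(x @ dft[:, :n // 2 + 1], np.fft.rfft(x))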
according to numpy my square wave matrix is singular. something for me to think about! i'm guessing a big issue is that my wavelet function doesn't have a way to shift phase. i don't really know much about wavelets; it's a word i remember from being a teenager, learning about these things in the resources i knew how to find. i don't know if it's the cause, but my wavelet will need a way to have phase. i could interpret the complex argument in terms of angle and magnitude, and simply phase shift the output. this seems like it would work fine, at first. but i'm not sure if it would survive the whole process, with no complex output. the trick with the sinusoids is that in frequency space, the phase is held as an angle in the complex plane. when the freq2time matrix is created, where does the complex output come from, and what meaning does it hold? it has only real-domain frequencies, and yet produces a matrix that converts from complex frequency. it takes the frequencies, and scales them by the offsets, and turns that into time parameters it passes the time parameters to the sinusoids. this produces a matrix that can process complex frequency data, and convert it to time data. each frequency is a magnitude and a phase. i can make a function that takes complex frequency data but the frequencies that are passed in are not complex: they are negative and positive. where the negative is the complex conjugate i.e. the angle iand hence phase is treated as negative. there are various things i could try but i'm trying to figure out what's going on in general. exp(2i pi x) produces a complex number rotated with an angle x how does this produce real-domain data? if i pass in a single frequency with a 0 angle, what does it do? i tried this: (Pdb) p np.array([0,1,0,1]) @ create_freq2time(4) array([ 5.00000000e-01+0.j, 3.06161700e-17+0.j, -5.00000000e-01+0.j, -9.18485099e-17+0.j]) so how does that work? (Pdb) p create_freq2time(4).round(2) array([[ 0.25+0.j , 0.25+0.j , 0.25+0.j , 0.25+0.j ], [ 0.25+0.j , 0. +0.25j, -0.25+0.j , -0. -0.25j], [ 0.25+0.j , -0.25-0.j , 0.25+0.j , -0.25-0.j ], [ 0.25+0.j , 0. -0.25j, -0.25-0.j , -0. +0.25j]]) columns are freq, rows are samps The first sample gets 0.25 + 0.25 = 0.5 The second sample is the only one with complex domain stuff, and its 0.25 components cancel each other. The third has -0.25 at the two indices. The fourth cancels again. I'm going to paste it again. array([[ 0.25+0.j , 0.25+0.j , 0.25+0.j , 0.25+0.j ], [ 0.25+0.j , 0. +0.25j, -0.25+0.j , -0. -0.25j], [ 0.25+0.j , -0.25-0.j , 0.25+0.j , -0.25-0.j ], [ 0.25+0.j , 0. -0.25j, -0.25-0.j , -0. +0.25j]]) It's 4 complex sinusoids, but they oppose each other in the complex domain. Because a real-valued signal has both negative and complex frequencies equal, and because the complex component of the wavelet function is odd, the two cancel. That would work fine for phaseless square wave data. You'd provide an even function square wave for the real component, and an odd function square wave for the imaginary component. What about phase offset? (Pdb) p (np.array([0,1+1j,0,1+1j]) @ create_freq2time(4)).round(3) array([ 0.5+0.5j, 0. +0.j , -0.5-0.5j, -0. -0.j ]) When I specify complex numbers for the frequencies, it actually produces a wave with the same phase, that has complex value, not a phased wave ! I thought the angle of the frequency was the phase, but it isn't? (Pdb) p np.fft.ifft(np.array([0,1+1j,0,1+1j])).round(3) array([ 0.5+0.5j, 0. +0.j , -0.5-0.5j, 0. +0.j ]) That appears to be so. 
That doesn't seem very useful to me, it seems it would be more useful if it were. Ohhhh I need to use the complex conjugate! (Pdb) p (np.array([0,1+1j,0,1-1j]) @ create_freq2time(4)).round(3) array([ 0.5+0.j, -0.5+0.j, -0.5+0.j, 0.5+0.j]) There we go. I offset it by 45 degrees? (Pdb) p (np.array([0,1,0,1]) @ create_freq2time(4)).round(3) array([ 0.5+0.j, 0. +0.j, -0.5+0.j, -0. +0.j]) I'm guessing the peaks for 45 degrees were between the samples. 90 might be clearer. Yeah here's 90 degrees offset: (Pdb) p (np.array([0,1j,0,-1j]) @ create_freq2time(4)).round(3) array([ 0. +0.j, -0.5+0.j, -0. +0.j, 0.5+0.j]) So how does that happen? (Pdb) p create_freq2time(4).round(3) array([[ 0.25+0.j , 0.25+0.j , 0.25+0.j , 0.25+0.j ], [ 0.25+0.j , 0. +0.25j, -0.25+0.j , -0. -0.25j], [ 0.25+0.j , -0.25-0.j , 0.25+0.j , -0.25-0.j ], [ 0.25+0.j , 0. -0.25j, -0.25-0.j , -0. +0.25j]]) Passing (0,1j,0,-1j) ... The first two frequencies have opposing components so they cancel. That would work with a 90 degree square wave. In the second .. it's 1j and -1j. the j's multiply to -1, and we get -.25 and -.25, which sum to 0.5 . the 90 degree case would work with square waves. but why does it work for sinusoids? cos(x) + sin(x)j = exp(2j pi x) is applied, not to the input data x, but rather to the frequency. then when the input data comes in, it is multiplied against that result. so if it's real, the sinusoid components cancel and the cosine is extracted. this then produces a real cosine wave. if it's imaginary and self-opposing, the cosines cancel from the self-opposingness, whereas the sines amplify, producing a wave that is 90 degrees off. how would a 45 degree wave be produced? what is that? it seems it would be a sum of the 0 degree and 90 degree waves, not sure. i looked at trigonometric identities and i found a relation that looked complex. let's see how it goes (Pdb) p create_freq2time(4).round(3) array([[ 0.25+0.j , 0.25+0.j , 0.25+0.j , 0.25+0.j ], [ 0.25+0.j , 0. +0.25j, -0.25+0.j , -0. -0.25j], [ 0.25+0.j , -0.25-0.j , 0.25+0.j , -0.25-0.j ], [ 0.25+0.j , 0. -0.25j, -0.25-0.j , -0. +0.25j]]) (Pdb) p (np.array([0,1+1j,0,1-1j]) @ create_freq2time(4)).round(3) array([ 0.5+0.j, -0.5+0.j, -0.5+0.j, 0.5+0.j]) first sample: (1+1j)(0.25) + (1-1j)(0.25) = 0.5 second sample: (1+1j)(0.25j) + (1-1j)(-0.25j) = -0.5 so the first sample is just the real component then again the second sample is just the complex component it really is two waves, rising at different times, offset by 90 degrees, or such (Pdb) p (np.array([0,0.5+1j,0,0.5-1j]) @ create_freq2time(4)).round(3) array([ 0.25+0.j, -0.5 +0.j, -0.25+0.j, 0.5 +0.j]) One could think of it as outputting the data in pairs. It outputs a real component, then a complex component, both in the real domain. Because one is offset by 90 degrees, this can provide for the construction of waves smoothly. It may work for the square wave. I guess I'll try it. 1600 I made a complex square wave. Basically the real and imaginary parts are juts off by 1/4 . The forward matrix it produces has an inverse, and the pseudoinverse passes np.allclose() with it. The assertion still doesn't pass, but again it should now be a closed system. 1603 I'm looking a little at the real-domain matrix produced for the square wave, and I'm not yet sure about it. 
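Writing the phase observation down as a numpy-only check: a conjugate pair of bins produces a real cosine whose phase is the angle of the positive-frequency bin and whose amplitude is 2*|X|/N, which is consistent with the 45- and 90-degree examples above.

import numpy as np

N = 4
X = np.zeros(N, dtype=complex)
c = 1 + 1j                      # magnitude sqrt(2), angle +45 degrees
X[1], X[N - 1] = c, np.conj(c)  # conjugate pair at +/- one cycle per N samples

x = np.fft.ifft(X)
assert np.allclose(x.imag, 0)   # conjugate symmetry -> real output
n = np.arange(N)
expected = 2 * abs(c) / N * np.cos(2 * np.pi * n / N + np.angle(c))
assert np.allclose(x.real, expected)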
(Pdb) list 174 forward_mat = np.linalg.pinv(inverse_mat) 175 if not is_complex: 176 # todo: remove test data 177 time_data = np.random.random(inverse_mat.shape[1]) 178 freq_data = time_data @ forward_mat 179 -> forward_mat = np.concatenate([ 180 forward_mat[:,:tail_idx], 181 forward_mat[:,tail_idx:neg_start_idx].conj() 182 ], axis=1) 183 return forward_mat 184 (Pdb) normal_forward_mat = np.linalg.pinv(create_freq2time(time_count, freqs)) (Pdb) p normal_forward_mat[6].round(3) array([ 1. +0.j , -0.707-0.707j, -0. +1.j , 0.707-0.707j, -1. -0.j , 0.707+0.707j, -0. -1.j , -0.707+0.707j, 1. -0.j , -0.707-0.707j, -0. +1.j , 0.707-0.707j, -1. -0.j , 0.707+0.707j, 0. -1.j , -0.707+0.707j]) (Pdb) p forward_mat[6].round(3) array([-0.5+0.5j, 0.5+0.5j, -0.5-0.5j, -0.5+0.5j, 0.5-0.5j, -0.5-0.5j, 0.5+0.5j, 0.5-0.5j, -0.5+0.5j, 0.5+0.5j, -0.5-0.5j, -0.5+0.5j, 0.5-0.5j, -0.5-0.5j, 0.5+0.5j, 0.5-0.5j]) In the normal dft matrix, the negative frequencies that occupy the second half, are the reverse-order complex conjugates of the positive frequencies that occupy the first half. In the forward_mat here, where the the square wave is being used, this complex conjugate situation does not hold. It's 1609. Basically, the even-numbered indices are conjugates, whereas the odds are opposites. It might be possible this could relate to the evaluation of the square wave at its discontinuous points. With real frequencies, the implication is that there are never frequencies equal to 0.125, for example. Looking a little further I'm not immediately discerning a pattern. 1640 having trouble developing behaviors and energy and stuff around this next step. i'm looking at inverse_mat and it seems to have a similar issue. i'm thinking that the real component of the waveform is not an even function, and that that is the issue here. trying out numbers, it kind of looks like the wavelet is getting indexed in a way that is rounding negative, rather than toward zero. that's something i can look into! this is what I have: SINE = lambda x: np.sin(2 * np.pi * x) SQUARE = lambda x: np.floor(x * 2) * 2 - 1 def complex_wavelet(wavelet, x): return wavelet((x + 0.25) % 1) + wavelet(x % 1) * 1j and I'm getting this:
complex_wavelet(SQUARE, -.25)
(-1+1j)
complex_wavelet(SQUARE, .25)
(1-1j)
the real component should have the same sign for correct behavior. other than the cutoff point, it is even:
complex_wavelet(SQUARE, .24)
(-1-1j)
complex_wavelet(SQUARE, -.24)
(-1+1j)
complex_wavelet(SQUARE, .26)
(1-1j)
complex_wavelet(SQUARE, -.26)
(1+1j)
SINE(0)
0.0
SINE(0.5)
1.2246467991473532e-16
SQUARE(0)
-1.0
SQUARE(0.5)
1.0
The sine function has 0 and 1/2 equal, whereas the square function does not. If the square function had 1.0 | x==0 or 1.0 | x > 0.5, it would work, but instead it has 1.0 | x >= 0.5 . The difference stems from floor(x * 2). With the floor function, lower values are included in the range, and upper values are not. I wonder if it would be appropriate to offset things by 0.125 simply to make the behavior look even when sampled at the points : but this wouldn't work for denser sampling, of course, which I imagine is needed. 1703 I changed SQUARE to (x > 0.5) * 2 - 1, which kinda works because True is treated like 1 and False treated like 0. Assertion still failing. Note: if an even function is required for a wavelet to work, it would make sense to define them as even functions. 2036 I ended up doing other things. The matrix did look better with the change. I would like to look at it more.
0458 ! Where I left off yesterday, after giving complex phase to my square wave, the matrix looks better, but there are still components that aren't conjugates. (Pdb) p inverse_mat[:,6] array([-0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625+0.0625j, -0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625-0.0625j, -0.0625+0.0625j, -0.0625-0.0625j, 0.0625+0.0625j]) (Pdb) p freqs array([ 0. , 0.0625, 0.125 , 0.1875, 0.25 , 0.3125, 0.375 , 0.4375, -0.5 , -0.4375, -0.375 , -0.3125, -0.25 , -0.1875, -0.125 , -0.0625]) It looks to me like the first component that isn't a conjugate when negated is frequency 0.25 . 0.25 makes 0.0625-0.0625j, whereas -0.25 makes 0.0625-0.0625j: the two values are the same. I've likely made this situation by having a function that doesn't produce the property. The 0-offset sample shows it much easier: (Pdb) p inverse_mat[:,0] array([-0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j]) It looks like it has no conjugates at all. (Pdb) p create_freq2time(time_count, freqs, SINE)[:,0] array([0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j]) (Pdb) p create_freq2time(time_count, freqs, SQUARE)[:,0] array([-0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j, -0.0625-0.0625j]) My square wave never outputs a 0 value, so it is impossible for it to match its complex conjugate at time=0, which has no negative. This likely happens again at time=1, time=2, etc. (Pdb) p complex_wavelet(SQUARE, np.array([0,0,1,-1,2,-2])) array([-1.-1.j, -1.-1.j, -1.-1.j, -1.-1.j, -1.-1.j, -1.-1.j]) They're all equal, with no conjugates at all! Thinking about this a little. - i could change the wave so its minimum is 0, but then it would not make negative conjugates unless i also negated the whole wave when negative - i could make the wave a step wave that touches 0, 1, and -1, but then it wouldn't be square any more - i could also interpret the whole transform a little differently, like i did to make the matrix supporting off-frequencies in the first place, to facilitate this I'm thinking that negating the wave when it is negative would match the behavior of sine. I'm accumulating the properties of the cosine wave that let it get replaced by other functions. - even i.e. f(x) == f(-x) - zero at 90 degrees f(0.25) == 0 - negative at 180 degrees f(0.5) == -f(0) Given my current complex wavelet function is taking the modulus of x to be near zero, if I made an odd-function square wave that had a base of 0 at 0 and hit +1 at t>0 and -1 at t<0, this could be considered equivalent to a 3-valued step wave, in some way, since x would loop, or I could stop looping x. 0545 Instead of a square wave, I'm just using STEP = lambda x: COSINE(x).round() . This significantly reduces the changes I need to comprehend. I'm thinking what's notable for the real-domain forward matrix, is the frequencies need to be reconstructible from only half of them. 
At the moment, that means that the second half _of the actual frequency data_ must be the reversed complex conjugates of the first half. A simpler approach to this could have been to not use complex numbers at all, and instead break data into two separate values. Then it would be obvious how to engage real-domain data: you'd just remove the complex data. 0557 The assertion is still failing, but the frequencies look correct: (Pdb) p (randvec @ create_time2freq(freqs=fftfreq(complex = True, repetition_samples = 16), wavelet=STEP)).round(2) array([ 9.13-0.j , -1.17+0.32j, 0.24-0.7j , -0.43-1.37j, 0.6 -0.62j, 0.16-0.9j , 0.8 +1.07j, -0.22+0.21j, -0.32+0.j , -0.22-0.21j, 0.8 -1.07j, 0.16+0.9j , 0.6 +0.62j, -0.43+1.37j, 0.24+0.7j , -1.17-0.32j]) (Pdb) p (randvec @ unextracting_ft).round(2) array([ 9.13-0.j , -1.17+0.32j, 0.24-0.7j , -0.43-1.37j, 0.6 -0.62j, 0.16-0.9j , 0.8 +1.07j, -0.22+0.21j, -0.32-0.j ]) The problem now must be in the inverse transform matrix, which is simpler. For some reason the real frequency data is still retaining imaginary components: (Pdb) p ((randvec @ unextracting_ft) @ extracting_ift).round(2) array([0.55-0.25j, 0.4 -0.21j, 0.34+0.43j, 0.74+0.28j, 0.5 -0.12j, 0.59+0.16j, 0.61-0.27j, 0.64-0.24j, 0.96+0.19j, 0.64-0.24j, 0.61-0.27j, 0.59+0.16j, 0.5 -0.12j, 0.74+0.28j, 0.34+0.43j, 0.4 -0.21j]) (Pdb) p randvec.round(2) array([0.55, 0.72, 0.6 , 0.54, 0.42, 0.65, 0.44, 0.89, 0.96, 0.38, 0.79, 0.53, 0.57, 0.93, 0.07, 0.09]) 0614 I'm here; it's a small and closed system: 126 if not is_complex: 127 has_nyquist = bool(freqs[-1] == 0.5) 128 head_idx = 1 if has_dc_offset else 0 129 tail_idx = len(freqs) - (1 if has_nyquist else 0) 130 131 # todo: remove test data 132 full_mat = complex_wavelet(wavelet, np.outer(np.concatenate([freqs, -freqs[head_idx:tail_idx][::-1]]), offsets)) 133 time_data = np.random.random(full_mat.shape[1]) 134 extended_freq_data = time_data @ np.linalg.inv(full_mat) 135 freq_data = extended_freq_data[:len(freqs)] (Pdb) 136 import pdb; pdb.set_trace() 137 138 mat[head_idx:tail_idx,:] += complex_wavelet(wavelet, np.outer(-freqs[head_idx:tail_idx], offsets)) 139 -> mat = mat.real full_mat @ extended_freq_data == time_data correctly. However, mat @ freq_data adds imaginary components to the output. They are no longer canceling. I don't yet understand why this is, given the same parts (-freqs[head_idx,tail_idx]) are summed into the output. 0622 So, like before, I imagine a thing to do is drilling down into the matrix construction. I'll consider sample 6. 0.46147936225293185 (Pdb) p full_mat[:,6].round(3) array([ 1. -0.j , 0.223+0.975j, -0.901+0.434j, -0.623-0.782j, 0.623-0.782j, 0.901+0.434j, -0.223+0.975j, -1. +0.j , -0.223-0.975j, 0.901-0.434j, 0.623+0.782j, -0.623+0.782j, -0.901-0.434j, 0.223-0.975j, 1. -0.j , 0.223+0.975j, -0.901+0.434j, -0.623-0.782j, 0.623-0.782j, 0.901+0.434j, -0.223+0.975j, -1. +0.j , -0.223-0.975j, 0.901-0.434j, 0.623+0.782j, -0.623+0.782j, -0.901-0.434j, 0.223-0.975j]) (Pdb) p (extended_freq_data @ full_mat[:,6]).round(3) (0.461-0j) (Pdb) p time_data[6].round(3) 0.461 0635 So. Every item in that column #6, is multiplied by every frequency, and they are summed. As far as I know, that's what the code is already reproducing, so there is something I'm conceiving wrongly going on here (as is often the case). I'll start by checking the top portion. 
The mat that's failing is made with these 2 lines: 125 mat = complex_wavelet(wavelet, np.outer(freqs, offsets)) 138 mat[head_idx:tail_idx,:] += complex_wavelet(wavelet, np.outer(-freqs[head_idx:tail_idx], offsets)) I'll check the top portion first. 0643 I failed to negate the nyquist frequency and found that the output differed because of this: (Pdb) p (extended_freq_data[:len(freqs)] @ full_mat[:len(freqs),6]).round(3) (0.458+0.004j) (Pdb) p (extended_freq_data[:len(freqs)] @ np.outer(freqs, offsets)[:len(freqs),6]).round(3) (-0.494-0.42j) (Pdb) p (extended_freq_data[:len(freqs)-1] @ np.outer(freqs, offsets)[:len(freqs)-1,6]).round(3) (-0.209-0.42j) (Pdb) p (extended_freq_data[:len(freqs)-1] @ np.outer(freqs, offsets)[:len(freqs)-1,6]).round(3) (-0.209-0.42j) Inspecting the source code, it looks like I am also failing to negate the nyquist frequency there, too. I tried inverting it and it still does not pass. 0648
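Before moving on, a quick check that the rounded cosine from 0545 has the three cosine properties accumulated above. COSINE isn't shown in these excerpts, so I'm assuming it is np.cos(2*pi*x), mirroring how SINE is defined:

import numpy as np

COSINE = lambda x: np.cos(2 * np.pi * x)   # assumed definition, mirroring SINE
STEP = lambda x: COSINE(x).round()

x = np.linspace(-1, 1, 2001)
assert np.allclose(STEP(x), STEP(-x))      # even: f(x) == f(-x)
assert STEP(0.25) == 0                     # zero at 90 degrees
assert STEP(0.5) == -STEP(0)               # negated at 180 degrees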
2022-11-17 1038 1047 i'm poking at it, and running into a complexity trying to make an effective freq2time matrix. for a matrix to work on its own, it functions by multiplying the positive frequencies. the larger matrix has parts that multiply the negative. the negative frequencies have data that is the complex conjugate of the data associated with the positive frequencies. so, if there were a function, it would be simple to kludge it by copying the data and taking its conjugate. but i'm not immediately sure how to make a matrix that performs the complex conjugate operation on its input. y'know, i could find that quickly by simply solving for it. 1049 of course it could be true, maybe even quite likely, that a general complex conjugate is not a linear operation. if y is the complex conjugate of x and we consider, is there a matrix where y = x @ M or M @ x then ... uh .... ok the np linalg functions i am presently used to all take an existing matrix. they don't operate on just two vectors. the simplest solution is to use a function rather than a matrix, or to ignore the half-sized real-domain matrices for now. reminder: the issue demonstrates via the difference between multiplying the conjugate data by the matrix for the negative frequencies, and multiplying the non-conjugate data by the negative of the same. it possibly or likely appears that for the sinusoid, the negative functions as if the conjugate were on the left, whereas this is not true for the step wave. it would make sense to go down to a smaller scale; to consider one differing value 1058 basically my internal considering goes off in the opposite direction of use, so looking at it helps a ton. (Pdb) p pos_freqs, neg_freqs (array([0.46428571]), array([-0.46428571])) (Pdb) p pos_freq_data, neg_freq_data (array([0.05144037-0.04259059j]), array([0.05144037+0.04259059j])) (Pdb) p offset array([27]) 138 -> mat[head_idx:tail_idx,:] += complex_wavelet(wavelet, np.outer(-freqs[head_idx:tail_idx], offsets)) (Pdb) p (neg_freq_data @ complex_wavelet(COSINE, np.outer(neg_freqs, offset))).round(3) array([-0.06-0.03j]) (Pdb) p (pos_freq_data @ complex_wavelet(COSINE, np.outer(neg_freqs, offset))).round(3) array([-0.041+0.053j]) Even with the cosine, the result of multiplying the conjugate by the negative frequencies is different than multiplying the non-conjugate. The fact that so many assertions are passing implies that this is somehow made up for by a similar countering multiplication elsewhere. considering +- b and +-d (a + bi) * (c + di) = ac + bci + adi - bd = ac - bd + (ad + bc)i so then we have ac - bd + adi - bci = X * (ac - bd + bci - adi) not that confident around that, ummm the sign of "a" impacts both "ac" and "ad" so if I want the sign of "ad" to change, then _either_ "a" _or_ "d" must change in sign if i don't want ac and bd to change, then either a-and-c or d-and-b must change in sign given both conditions rely on elements from both sides of the original multiplication (a + bi) * (c + di), i can conclude that it is unreasonable to assume i could find a real constant that would multiply only one of them to make the conjugate of the result. the inference is that with the cosine, there are further frequency values that balance the sum out so it still becomes correct 1134 i ran this test and i did not find the balancing to happen; i may have made a mistake. it would be worthwhile checking a passing assert. 
(Pdb) p (np.concatenate([extended_freq_data[1:14], extended_freq_data[15:]]) @ complex_wavelet(COSINE, np.outer(np.concatenate([extended_freqs[1:14], extended_freqs[15:]]), offset))).round(3)
array([-0.585+0.j])
(Pdb) p (freq_data[1:14] @ (complex_wavelet(COSINE, np.outer(extended_freqs[1:14], offset)) + complex_wavelet(COSINE, np.outer(extended_freqs[15:], offset))) ).round(3)
array([0.+0.j])
i might have run out of the mystery of continuing to do this immediately.
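On the earlier question of whether taking the complex conjugate can be a matrix: over complex scalars it can't, since conj(1j*x) = -1j*conj(x), so it isn't complex-linear; but once the real and imaginary parts are broken out into separate real elements it is just a diagonal sign flip. A small sketch:

import numpy as np

z = np.random.random(4) + 1j * np.random.random(4)

# not complex-linear: scaling the input by 1j does not scale the conjugate by 1j
assert not np.allclose(np.conj(1j * z), 1j * np.conj(z))

# but as an operation on (real, imag) pairs it is a plain real matrix
z_pairs = np.stack([z.real, z.imag], axis=-1).reshape(-1)   # [re0, im0, re1, im1, ...]
conj_mat = np.kron(np.eye(len(z)), np.diag([1.0, -1.0]))    # flips every imag slot
w_pairs = z_pairs @ conj_mat
w = w_pairs.reshape(-1, 2) @ np.array([1, 1j])              # back to complex numbers
assert np.allclose(w, np.conj(z))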
I am a human being who wanted to live without luxury, to aid the world to my fullest. All my life I have had a unique view of the world, from refraining from urges and observing normality with questioning.
This thread represents an advancement of my "triangle" task, or an attempt to advance it. Rather than describing a triangle, I am developing utility and understanding around basic Fourier transforms. These are important to me because of past struggles I have had trying to produce shielded enclosures cheap enough for anybody to defend themselves with. I understand that, for some reason, some consider shielded enclosures to be advanced security technology. I find it impossible for them to be harmful, and have never had a reason for them to be harmful expressed to me.
I expect this work not to be useful for a shielded enclosure itself. I expect it instead to be good practice, and possibly to help me connect to more useful behaviors that communities value more. When I work on it, I imagine my past goal of making electromagnetic signals more clear and transparent.
My last project regarding the shielding use was an exploration of using a repetitive noise source as an emitter, to reduce radio circuitry needed to test shielding efficiency.
I need to be able to think things I choose, to consider things I choose, to try things I choose, to explore things I choose. Without daydreams, our plans are simply the orders of others.
To diagnose real-domain freq2time with step functions, it would make sense to understand how it can work with cosine functions.
{Boss steps into his office and calls up Rebel Worker 3. Boss: "Rebel Worker 3, Machine Learning Marketer has shown me how we can automate some of the mind control."}
(Pdb) p create_freq2time(freqs=fftfreq(repetition_samples=4, complex=True)).round(3) array([[ 0.25-0.j , 0.25-0.j , 0.25-0.j , 0.25-0.j ], [ 0.25-0.j , 0. +0.25j, -0.25+0.j , -0. -0.25j], [ 0.25-0.j , -0.25+0.j , 0.25-0.j , -0.25+0.j ], [ 0.25-0.j , -0. -0.25j, -0.25+0.j , 0. +0.25j]]) (Pdb) p create_freq2time(freqs=fftfreq(repetition_samples=4, complex=False)).round(3) array([[ 0.25, 0.25, 0.25, 0.25], [ 0.5 , -0. , -0.5 , -0. ], [ 0.25, -0.25, 0.25, -0.25]])
1723 the sinusoid data actually has the same issue as the step function data. the "real-domain" matrix produces output with some imaginary components added. (Pdb) p freq_data @ create_freq2time(freqs=fftfreq(repetition_samples=4, complex=True)) array([0.04032738-2.47552542e-17j, 0.11657769-3.24912464e-17j, 0.0633229 -2.75713933e-17j, 0.16327708-4.16797190e-17j]) (Pdb) p freq_data[:-1] @ create_freq2time(freqs=fftfreq(repetition_samples=4, complex=False)) array([0.04032738+2.33496941e-02j, 0.13992739-1.85659191e-17j, 0.0633229 -2.33496941e-02j, 0.13992739-1.85659191e-17j]) so, why isn't an assertion catching this? maybe i don't test real-domain data? actually, i do test this: rfreqs15t = fftfreq(repetition_samples=15, complex=False) rfreqs15f = fftfreq(15, complex=False) irft15 = create_freq2time(freqs=rfreqs15f) rft15 = create_time2freq(15, freqs=rfreqs15t) randvec2rtime15 = randvec[:15] @ irft15 randvec2rfreq15 = randvec[:15] @ rft15 randvec2irfft = np.fft.irfft(randvec[:15]) randvec2rfft = np.fft.rfft(randvec[:15]) assert np.allclose(rfreqs15t, np.fft.rfftfreq(15)) assert np.allclose(rfreqs15f, np.fft.rfftfreq(28)) assert np.allclose(randvec2rtime15, randvec2irfft) assert np.allclose(randvec2rfreq15, randvec2rfft) onward to the passing assertions. 1725. This was so hard to approach I am just quite satisfied to have added a little more !!
assert np.allclose(randvec2fft, np.linalg.solve(ift16.T, randvec)) import pdb; pdb.set_trace() rfreqs15t = fftfreq(repetition_samples=15, complex=False) [1729] 1731 In looking at the assertions, I notice I'm not testing against complex frequency data. This is likely why it's not caught. internal experiences again. having {great} trouble staying. [1732]
I feel confident there is a resolution to the real-domain matrix situation. I would like to add a complex-domain test to the assertions, to help me comprehend that.
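A sketch of the kind of assertion I mean. It reuses fftfreq and create_freq2time from this file with the keyword names already used above, so treat it as untested pseudocode against the current state of the code; the point is only that the frequency data fed in has to be genuinely complex, e.g. an actual rfft of real time data, rather than a real-valued vector reinterpreted as frequencies:

import numpy as np
# fftfreq and create_freq2time are the functions from fourier.py under test here

time_data = np.random.random(16)
freq_data = np.fft.rfft(time_data)   # genuinely complex bins, unlike randvec[:15]
rfreqs = fftfreq(repetition_samples=16, complex=False)
irft = create_freq2time(freqs=rfreqs)
assert np.allclose(freq_data @ irft, np.fft.irfft(freq_data, 16))
assert np.allclose(freq_data @ irft, time_data)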
this is a 4x4 inverse dft matrix: array([[ 0.25-0.j , 0.25-0.j , 0.25-0.j , 0.25-0.j ], [ 0.25-0.j , 0. +0.25j, -0.25+0.j , -0. -0.25j], [ 0.25-0.j , -0.25+0.j , 0.25-0.j , -0.25+0.j ], [ 0.25-0.j , -0. -0.25j, -0.25+0.j , 0. +0.25j]]) each column is a vector in the complex plane of length 1/4 rotating by increasing multiples of 90 degrees. So is each row.
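That matrix can be rebuilt directly from numpy, which makes the rotating-by-90-degrees picture concrete and gives a reference to compare against:

import numpy as np

N = 4
k = np.arange(N)
idft = np.exp(2j * np.pi * np.outer(k, k) / N) / N   # entry [k, n] rotates by k*n*90 degrees
assert np.allclose(idft, np.fft.ifft(np.eye(N)))     # row i is the ifft of the i-th unit vector
x = np.random.random(N) + 1j * np.random.random(N)
assert np.allclose(x @ idft, np.fft.ifft(x))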
--- what's feeling reasonable for me here is having the real case function differently from the imaginary case, in that (for now) the user must .view(float) the complex frequency data to process it. the idea of this code would be to bundle the functionality into a class, so it doesn't matter too much what the interface is, but it is nice ot have it small and somewhat repluggable with numpy. 0425 0435 Interestingly, numpy's real-domain FFT is not idempotent for odd-lengthed vectors ! The code assumes an even-lengthed vector. Here you can see me performing a numpy real-domain FFT on a 15-lengthed vector, then a numpy real-domain inverse FFT, and getting back a 14-lengthed vector with differing content: (Pdb) p np.fft.rfft(randvec[:15]).shape (8,) (Pdb) p np.fft.irfft(np.fft.rfft(randvec[:15])).shape (14,) (Pdb) p np.fft.irfft(np.fft.rfft(randvec[:15])) array([0.56987486, 0.76018321, 0.64798214, 0.52903067, 0.52608816, 0.63919848, 0.55517019, 1.19692063, 0.57350214, 0.66362825, 0.74056271, 0.47672495, 1.03527704, 0.1288166 ]) (Pdb) p randvec[:15] array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 , 0.64589411, 0.43758721, 0.891773 , 0.96366276, 0.38344152, 0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606]) I checked the help and it explains that there are further parameters to provide for this. The help also mentions the Hermitian-symmetry i discovered and learned of by poking at this. ----- Returns the real valued `n`-point inverse discrete Fourier transform of `a`, where `a` contains the non-negative frequency terms of a Hermitian-symmetric sequence. `n` is the length of the result, not the input. If you specify an `n` such that `a` must be zero-padded or truncated, the extra/removed values will be added/removed at high frequencies. One can thus resample a series to `m` points via Fourier interpolation by: ``a_resamp = irfft(rfft(a), m)``. The correct interpretation of the hermitian input depends on the length of the original data, as given by `n`. This is because each input shape could correspond to either an odd or even length signal. By default, `irfft` assumes an even output length which puts the last entry at the Nyquist frequency; aliasing with its symmetric counterpart. By Hermitian symmetry, the value is thus treated as purely real. To avoid losing information, the correct length of the real input **must** be given. 0443 i have a failing assertion at the start of the file now. on to make a real-valued matrix. 0456 array([[ 0.25-0.j , 0.25-0.j , 0.25-0.j , 0.25-0.j ], [ 0.25-0.j , 0. +0.25j, -0.25+0.j , -0. -0.25j], [ 0.25-0.j , -0.25+0.j , 0.25-0.j , -0.25+0.j ], [ 0.25-0.j , -0. -0.25j, -0.25+0.j , 0. +0.25j]]) thinking about how to do multiplication 0515 so, basically the output is real * real - 2 * imag * imag i'm making a matrix, but i'm confused about reshaping it. the complex matrix has [freq, sample], where freq = sample = 28 the real matrix will have shape [freq_components, sample], where freq_components = 30 (2*15). the rows of the real matrix will be composed of alternating real and imaginary parts. the imaginary parts will be negated. and for the doubled frequencies, the imaginary parts will be doubled. the axes are numbered starting from 0. so 0 = frequency, and 1 = sample. numpy has "concatenate" and "stack" operations. if i make a complex matrix, to troubleshoot-bisect clearly with working code, i could then extract the real and imaginary components with mat.real and mat.imag, or with .view(float) . 
.view(float) would extract them adjacent to each other in rows, increasing the sample count in the matrix. i want them adjacent to each other in columns. and .view only works if the matrix is not transposed, kinda. so i might have the real and imaginary components as two separate matrices. i then want to interweave the columns of those matrices: first one value, then another. a way to interweave is to place them adjacent to each other and then reshape the matrix to exclude the dimension -- _i think_; i might be misthinking something here. for the reshape-to-interweave path, the matrix axis to be interwoven needs to be the last one, or at least the one with the smallest stride in the underlying storage. so i might transpose the matrix, to make frequencies be rows, and samples be columns. once frequencies are rows, i would add a further dimension of real and imaginary. this is equivalent to transposing and then doing .view(), i think. a simple approach would be complex_mat.T.copy().view(float).T . this then leaves out negating the imaginary component, so maybe not using .view() could be clearer to implement. [i could also negate the frequency passed to the wave function, or swap them] here's what i've got:
a
array([[1.+2.j, 3.+4.j],
       [5.+6.j, 7.+8.j]])

np.stack([a.real.T, -a.imag.T], axis=-1).reshape(a.shape[1], a.shape[0]*2).T
array([[ 1.,  3.],
       [-2., -4.],
       [ 5.,  7.],
       [-6., -8.]])
.T is transpose. 0527 0536
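For reference, the complex_mat.T.copy().view(float).T route mentioned above gives the same interleaving apart from the sign of the imaginary rows, which is the part the stack/reshape version handles explicitly. A small check on the same 2x2 example:

import numpy as np

a = np.array([[1 + 2j, 3 + 4j],
              [5 + 6j, 7 + 8j]])

stacked = np.stack([a.real.T, -a.imag.T], axis=-1).reshape(a.shape[1], a.shape[0] * 2).T
viewed = a.T.copy().view(np.float64).T          # same layout, imaginary rows not negated

assert np.allclose(viewed[0::2], stacked[0::2])     # real rows match
assert np.allclose(viewed[1::2], -stacked[1::2])    # imaginary rows differ only by sign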
battery is at 45% ! 0541 i fixed a mistake and two more assertions are passing. it's paused on the failing one: 269 rfreqs15t = fftfreq(repetition_samples=15, complex=False) 270 rfreqs15f = fftfreq(15, complex=False) 271 irft_from_15 = create_freq2time(freqs=rfreqs15f) 272 rft_from_15 = create_time2freq(15, freqs=rfreqs15t) 273 irft_to_15 = create_freq2time(15, freqs=rfreqs15t) 274 randvec2rtime15 = randvec[:15].astype(np.complex128).view(float) @ irft_from_15 275 randvec2rfreq15 = randvec[:15] @ rft_from_15 276 randvec2irfft = np.fft.irfft(randvec[:15]) 277 randvec2rfft = np.fft.rfft(randvec[:15]) 278 randvec2rfreq152randvec = randvec2rfreq15.view(float) @ irft_to_15 279 randvec2rfreq152irfft = np.fft.irfft(randvec2rfreq15, 15) 280 assert np.allclose(rfreqs15t, np.fft.rfftfreq(15)) 281 assert np.allclose(rfreqs15f, np.fft.rfftfreq(28)) 282 assert np.allclose(randvec2rtime15, randvec2irfft) 283 assert np.allclose(randvec2rfreq15, randvec2rfft) 284 -> assert np.allclose(randvec2rfreq152randvec, randvec2rfreq152irfft) 285 assert np.alloclose(randvec2rfreq152randvec, randvec[:15]) This is likely the time-domain result of complex frequencies. my plan was to break in the matrix generation, and run test data through the steps to see when it differed. 134 complex_freqs = np.concatenate([ 135 freqs[:tail_idx], 136 -freqs[tail_idx:], 137 -freqs[head_idx:tail_idx][::-1] 138 ]) 139 complex_mat = complex_wavelet(wavelet, np.outer(complex_freqs, offsets)) 162 interim_mat = complex_mat.copy() 163 interim_mat[head_idx:tail_idx] += interim_mat[len(freqs):][::-1] 164 mat = np.stack([ 165 interim_mat[:len(freqs)].real.T, 166 -interim_mat[:len(freqs)].imag.T 167 ], axis=-1).reshape(len(complex_freqs), len(freqs)*2).T 185 return mat / freq_count # scaled to match numpy convention note to self: the help said you can change when that / freq_count normalization happens in numpy, i believe. (Pdb) break 162 Breakpoint 1 at /shared/src/scratch/fourier.py:162 -> interim_mat = complex_mat.copy() (Pdb) time_data = np.random.random(complex_mat.shape[-1]) (Pdb) complex_freq_data = time_data @ np.linalg.inv(complex_mat) (Pdb) real_freq_data = np.concatenate([complex_freq_data[:tail_idx], complex_freq_data[tail_idx:len(freqs)].conj()]) ... (Pdb) p complex_freq_data @ complex_mat array([0.0202184 +1.38777878e-16j, 0.83261985-2.77555756e-17j, 0.77815675+2.77555756e-16j, 0.87001215-1.38777878e-16j, 0.97861834-1.24900090e-16j, 0.79915856-1.94289029e-16j, 0.46147936-2.22044605e-16j, 0.78052918-1.80411242e-16j, 0.11827443-1.04083409e-16j, 0.63992102+3.60822483e-16j, 0.14335329-1.04083409e-16j, 0.94466892-1.87350135e-16j, 0.52184832+2.74086309e-16j, 0.41466194+0.00000000e+00j, 0.26455561-1.73472348e-17j, 0.77423369+5.55111512e-17j, 0.45615033-5.13478149e-16j, 0.56843395+2.77555756e-16j, 0.0187898 -4.16333634e-17j, 0.6176355 -1.66533454e-16j, 0.61209572+1.66533454e-16j, 0.616934 -3.33066907e-16j, 0.94374808+6.24500451e-17j, 0.6818203 +2.84494650e-16j, 0.3595079 +2.63677968e-16j, 0.43703195+3.60822483e-16j, 0.6976312 -2.70616862e-16j, 0.06022547+3.60822483e-16j]) (Pdb) p real_freq_data.view(float) @ mat array([0.0202184 , 0.44642266, 0.73789397, 0.65352205, 0.66906312, 0.74048943, 0.70261372, 0.69873159, 0.36518507, 0.62877826, 0.08107154, 0.75655143, 0.48899933, 0.59444781, 0.26455561, 0.59444781, 0.48899933, 0.75655143, 0.08107154, 0.62877826, 0.36518507, 0.69873159, 0.70261372, 0.74048943, 0.66906312, 0.65352205, 0.73789397, 0.44642266]) i'll pick sample index 6 to debug again i think. 
(Pdb) p (complex_freq_data @ complex_mat)[6] (0.4614793622529322-2.220446049250313e-16j) (Pdb) p (real_freq_data.view(float) @ mat)[6] 0.7026137203837779 (Pdb) p complex_freq_data @ complex_mat[:,6] (0.4614793622529323-2.6020852139652106e-16j) (Pdb) p real_freq_data.view(float) @ mat[:,6] 0.7026137203837779 I tried the nyquist frequency, and it looks like it's working right: (Pdb) p complex_freq_data[tail_idx] * complex_mat[tail_idx,6] (-0.09512353353755763-2.275865808140467e-17j) (Pdb) p real_freq_data.view(float)[tail_idx*2:] @ mat[tail_idx*2:,6] -0.09512353353755763 so I guess the problem is in combining the conjugates in the interim matrix. (Pdb) p complex_mat[:,6][head_idx:tail_idx].round(3) array([ 0.223+0.975j, -0.901+0.434j, -0.623-0.782j, 0.623-0.782j, 0.901+0.434j, -0.223+0.975j, -1. +0.j , -0.223-0.975j, 0.901-0.434j, 0.623+0.782j, -0.623+0.782j, -0.901-0.434j, 0.223-0.975j]) (Pdb) p complex_mat[:,6][len(freqs):][::-1].round(3) array([ 0.223-0.975j, -0.901-0.434j, -0.623+0.782j, 0.623+0.782j, 0.901-0.434j, -0.223-0.975j, -1. +0.j , -0.223+0.975j, 0.901+0.434j, 0.623-0.782j, -0.623-0.782j, -0.901+0.434j, 0.223+0.975j]) ok, yeah, as conjugates, i lose information if I just sum them in complex space. i need to negate the imaginary parts of the conjugates. 0556 interim_mat = complex_mat.copy() interim_mat[head_idx:tail_idx].real += interim_mat[len(freqs):][::-1].real interim_mat[head_idx:tail_idx].imag -= interim_mat[len(freqs):][::-1].imag mat = np.stack([ interim_mat[:len(freqs)].real.T, -interim_mat[:len(freqs)].imag.T ], axis=-1).reshape(len(complex_freqs), len(freqs)*2).T 0600 interim_mat = complex_mat.copy() interim_mat[head_idx:tail_idx] += interim_mat[len(freqs):][::-1].conj() mat = np.stack([ interim_mat[:len(freqs)].real.T, -interim_mat[:len(freqs)].imag.T ], axis=-1).reshape(len(complex_freqs), len(freqs)*2).T 0603 the same assertion is still failing. i figured i had just done the wrong fix or poorly, but sample 6 is matching now: (Pdb) p complex_freq_data[:] @ complex_mat[:,6] (0.4614793622529323-2.6020852139652106e-16j) (Pdb) p real_freq_data.view(float)[:] @ mat[:,6] 0.46147936225293196 looks like they all are now, which is good: (Pdb) p (complex_freq_data @ complex_mat).round(3) array([0.02 +0.j, 0.833-0.j, 0.778+0.j, 0.87 -0.j, 0.979-0.j, 0.799-0.j, 0.461-0.j, 0.781-0.j, 0.118-0.j, 0.64 +0.j, 0.143-0.j, 0.945-0.j, 0.522+0.j, 0.415+0.j, 0.265-0.j, 0.774+0.j, 0.456-0.j, 0.568+0.j, 0.019-0.j, 0.618-0.j, 0.612+0.j, 0.617-0.j, 0.944+0.j, 0.682+0.j, 0.36 +0.j, 0.437+0.j, 0.698-0.j, 0.06 +0.j]) (Pdb) p (real_freq_data.view(float) @ mat).round(3) array([0.02 , 0.833, 0.778, 0.87 , 0.979, 0.799, 0.461, 0.781, 0.118, 0.64 , 0.143, 0.945, 0.522, 0.415, 0.265, 0.774, 0.456, 0.568, 0.019, 0.618, 0.612, 0.617, 0.944, 0.682, 0.36 , 0.437, 0.698, 0.06 ]) i tested both generations in their generation function that they made the same output as a complex matrix, and they appeared to to me. these two values are supposed to be the same but are differing: 278 randvec2rfreq152randvec = randvec2rfreq15.view(float) @ irft_to_15 279 randvec2rfreq152irfft = np.fft.irfft(randvec2rfreq15, 15) i've bumped into a couple different likely mistakes that could cause that difference. 
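The algebra behind folding the conjugate rows in, as a one-line check: a frequency value X and its matrix entry B contribute X*B + conj(X)*conj(B) = 2*Re(X*B) as a pair, which in (real, imag) terms is 2*(Re X * Re B - Im X * Im B); summing the rows without the .conj() drops the sign flip on the imaginary part, which is exactly the information that was getting lost.

import numpy as np

X = np.random.random(8) + 1j * np.random.random(8)   # positive-frequency data
B = np.random.random(8) + 1j * np.random.random(8)   # matching matrix entries

pair_sum = X * B + np.conj(X) * np.conj(B)           # contribution of a conjugate pair
assert np.allclose(pair_sum, 2 * (X * B).real)
assert np.allclose(pair_sum, 2 * (X.real * B.real - X.imag * B.imag))

# summing the matrix entries without conjugating does NOT give the same thing
assert not np.allclose(X * B + np.conj(X) * B, pair_sum)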
0608 0621 summary of failure: vector.view(float) @ create_freq2time(15, freqs=fftfreq(repetition_samples=15,complex=False)) is producing different output than np.fft.irfft(vector, 15) - there is no nyquist frequency - the real frequencies used are rfreqs15t - the test vector is randvec2rfreq15 - the failing matrix is constructed as irft_to_15 on line 273 so, inside the call on line 273, i should expect the final matrix to have the same effect as np.fft.irfft(_, 15) (Pdb) list 134 complex_freqs = np.concatenate([ 135 freqs[:tail_idx], 136 -freqs[tail_idx:], 137 -freqs[head_idx:tail_idx][::-1] 138 ]) 139 -> complex_mat = complex_wavelet(wavelet, np.outer(complex_freqs, offsets)) 140 141 # assuming the input's real and complex components are broken out into separate elements 142 143 # real_in, imag_in 144 # multiplied by conjugate pair and summed (Pdb) p is_complex, has_dc_offset, has_nyquist (False, True, False) (Pdb) p complex_freqs array([ 0. , 0.06666667, 0.13333333, 0.2 , 0.26666667, 0.33333333, 0.4 , 0.46666667, -0.46666667, -0.4 , -0.33333333, -0.26666667, -0.2 , -0.13333333, -0.06666667]) Something is different before the conversion to real: (Pdb) time_data = np.random.random(15) (Pdb) complex_freq_data = np.fft.fft(time_data) (Pdb) p np.fft.ifft(complex_freq_data).round(3) array([0.02 -0.j, 0.833-0.j, 0.778-0.j, 0.87 +0.j, 0.979+0.j, 0.799+0.j, 0.461-0.j, 0.781+0.j, 0.118-0.j, 0.64 -0.j, 0.143+0.j, 0.945-0.j, 0.522+0.j, 0.415-0.j, 0.265+0.j]) (Pdb) p (complex_freq_data @ complex_mat / freq_count).round(3) array([0.022-0.j, 0.892-0.j, 0.834-0.j, 0.932-0.j, 1.049-0.j, 0.856+0.j, 0.494-0.j, 0.836-0.j, 0.127-0.j, 0.686-0.j, 0.154+0.j, 1.012-0.j, 0.559-0.j, 0.444-0.j, 0.283+0.j]) It's strange to encounter this. I wonder what obscure error I have made for myself. The frequencies used to make the matrix are the same: (Pdb) p complex_freqs array([ 0. , 0.06666667, 0.13333333, 0.2 , 0.26666667, 0.33333333, 0.4 , 0.46666667, -0.46666667, -0.4 , -0.33333333, -0.26666667, -0.2 , -0.13333333, -0.06666667]) (Pdb) p np.fft.fftfreq(len(complex_freq_data)) array([ 0. , 0.06666667, 0.13333333, 0.2 , 0.26666667, 0.33333333, 0.4 , 0.46666667, -0.46666667, -0.4 , -0.33333333, -0.26666667, -0.2 , -0.13333333, -0.06666667]) Here's the line again it's made with: 139 complex_mat = complex_wavelet(wavelet, np.outer(complex_freqs, offsets)) I guess it's back down to comparing a single sample or whatnot. I'll compare sample 6 again. 0634 (Pdb) p (complex_freq_data @ complex_mat[:,6]) / np.fft.ifft(complex_freq_data)[6] (15.000000000000009-1.8056631630774395e-15j) (Pdb) p freq_count 14 Looks like I'm dividing by the wrong denominator. 0636 here's how freq_count is originally made: 117 freq_count = len(freqs) 118 if not has_dc_offset: 119 freq_count += 1 120 if not is_complex: 121 freq_count = (freq_count - 1) * 2 (Pdb) p len(freqs) 8 (Pdb) p has_dc_offset True (Pdb) p is_complex False (Pdb) p (8 - 1) * 2 14 I'm kind of running into the same situation numpy had, but I've already established a workaround of checking the frequency content to handle it. Where did these freqs come from? 269 rfreqs15t = fftfreq(repetition_samples=15, complex=False) They were generated as real-domain frequencies for 15 samples. (Pdb) p np.fft.rfftfreq(15) array([0. , 0.06666667, 0.13333333, 0.2 , 0.26666667, 0.33333333, 0.4 , 0.46666667]) (Pdb) p len(np.fft.rfftfreq(15)) 8 Looking at their values, one can tell that ther is a DC offset, but there is no nyquist frequency. 
So the total number would be 7*2 (conjugates) + 1 (dc offset) = 15. If there were a nyquist, it would be + another 1. Here:

118         if not has_dc_offset:
119             freq_count += 1
120         if not is_complex:
121             freq_count = (freq_count - 1) * 2

It appears to me to be incorrect to increment freq_count before doubling it, as it leaves unaddressed whether there is a nyquist frequency or not. The doubling in general assumes both a nyquist and a DC, but with odd frequency lengths there is never a nyquist, even in the complex domain. I think I just copied that (n - 1) * 2 expression from the numpy docs while handling different kinds of confusion.

0642 0643 no, my analysis above is wrong.

118         if not has_dc_offset:
119             freq_count += 1

here, it has a DC offset, so this code isn't hit. otherwise it would be adjusting for the below to work.

120         if not is_complex:
121             freq_count = (freq_count - 1) * 2

here, it ignores the DC offset, and doubles the remainder. ok umm let me look at it again. 8 - 1 = 7. 7 * 2 = 14. then it would need to add 1 to bring back the DC offset, which it isn't doing. what if there's a nyquist? 8 (nyquist + dc + 6 conjugates) - 1 = 7. 7 * 2 = 14 (6*2 + 2 = 14). it's correct if there's a nyquist. if there is not a nyquist, it should be one more. I did this, but I'm not confident that it's the right solution; I don't fully understand it. still, it seemed like the immediate bandaid:

    if not is_complex:
        has_nyquist = bool(freqs[-1] == 0.5)
        freq_count = (freq_count - 1) * 2 + has_nyquist

0648 ok the code is now crashing with data that has a nyquist

ok i had inverted my condition. it took me some exploring in pdb to identify this. 0651.

    if not is_complex:
        has_nyquist = bool(freqs[-1] == 0.5)
        freq_count = (freq_count - 1) * 2 + (not has_nyquist)

That assertion is now passing. 0652.

0654 the new code is not making the right matrix for the longvec situation where there are way more samples than frequencies. i feel much more confident engaging this after the previous work, but i wanted to note the frequencies.

(Pdb) p longspace_freqs
array([0. , 0.00270655, 0.0054131 , 0.00811965, 0.0108262 , 0.01353275, 0.0162393 , 0.01894585, 0.0216524 ])

There is a DC offset, no immediately visible nyquist, and the frequencies are real domain. It's notable here that, since the upper frequency has changed, the presence of 0.5 in the middle bin no longer indicates whether there was an odd or even number of original frequencies :/ . Checking 0.5 was a quick hack to keep using the same code layout, but it was a quick hack I was hoping worked. It's notable that in this situation, the repetition_samples (output sample count of the frequency matrix) is actually a floating point value. So I am likely using freq_count in a way that isn't appropriate here. I probably mean to use time_count .

0658

ValueError: cannot reshape array of size 28800 into shape (17,18)
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
/shared/src/scratch/fourier.py(167)create_freq2time()
-> ], axis=-1).reshape(len(complex_freqs), len(freqs)*2).T
(Pdb) list
162         interim_mat = complex_mat.copy()
163         interim_mat[head_idx:tail_idx] += interim_mat[len(freqs):][::-1].conj()
164         mat = np.stack([
165             interim_mat[:len(freqs)].real.T,
166             -interim_mat[:len(freqs)].imag.T
167  ->     ], axis=-1).reshape(len(complex_freqs), len(freqs)*2).T
The output array has frequencies as columns (first dimension) and samples as rows (second dimension). It looks like I'm using len(complex_freqs) where I should be using time_count .

note:
- check for other (n-1)*2 uses and fix them
- check for uses of is_nyquist and consider them for fractional repetition_samples or manual time_count

looks like that was the only (n-1)*2

i changed the matrix shape to use time_count

regarding is_nyquist, this will mean freq_count is wrong for subsignal matrices with a nyquist. do they happen? when do they happen? i think nyquists happen when there is an even number of complex frequency bins?

    if freq_count % 2 == 0:
        if complex:
            neg_freqs = np.linspace(-max_freq, -min_freq, num=freq_count // 2, endpoint=True)
            pos_freqs = -neg_freqs[:0:-1]
        else:
            pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True)
            neg_freqs = pos_freqs[:0]
    else:
        pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True)
        neg_freqs = -pos_freqs[::-1] if complex else pos_freqs[:0]

Yes. When an even number of frequency bins was originally specified, then a nyquist is included. Its value is max_freq, and it has no complex conjugate. Compare to when an odd number of frequency bins is originally specified; then there is no nyquist and max_freq has a complex conjugate.

When only real-domain frequencies are output, they are all linearly spaced aside from the DC offset. So it looks like the information on whether there is a nyquist or not is not within the list of real-domain frequencies. [a fudgy way to include this data would be to negate the nyquist, like it is in the complex frequencies]

a way to back negating the nyquist would be that a user generally doesn't want less data from the same storage. given it is quite reasonable to take the complex conjugate of the nyquist when it is not actually the nyquist frequency but simply a middle frequency of something else, the user may want to assume the larger number of frequency bins underlying the data, to extract more information from it in more detail.

does this make a difference ?? ... i think it means that 1 more sample is needed to process a single repetition. ... i'm not certain at this time what it means beyond that. i guess for now i'll issue a warning or raise an exception in that case.

    if not complex:
        raise NotImplementedError("the time2freq functions do not presently retain the information that an odd freq_count is associated with real-domain data")

i raised the exception in this case because even freq_counts are more normative. so, given the freq_count is even, this means there is indeed a frequency with no complex conjugate, which i've been calling the nyquist ... so i can just set that value to True

0734 it looks like my code here:

    if not complex:
        freq_count = int((freq_count - 1) * 2)

actually usually prevents there from being an odd freq_count in the real domain

0738 trying to comprehend that there is a case where an odd freq_count happens. when repetition_samples=15 and complex=False are passed to fftfreq, it basically ends up calculating freq_count = repetition_samples . this makes for an odd freq_count and real-domain data. it's kind of an obscurity of the failing test: it passes 15 to the numpy functions too. uhhh i'll just bring back the odd_freq_count bool parameter for now 0740

0744 ok, now normal assertions are failing again. also i added a fun warning for when odd+noncomplex happens.
299  ->     assert np.allclose(randvec2rfreq152randvec, randvec2rfreq152irfft)
293         randvec2rfreq152randvec = randvec2rfreq15.view(float) @ irft_to_15
294         randvec2rfreq152irfft = np.fft.irfft(randvec2rfreq15, 15)
287         rft_from_15 = create_time2freq(15, freqs=rfreqs15t, odd_freq_count=True)
288         irft_to_15 = create_freq2time(15, freqs=rfreqs15t)

oops, gotta pass that parameter i made for the second use too.

0748 all the basic asserts are passing again, and additionally the longvec matrices are created without crashing (the associated asserts are still failing, and i've changed things they use, so that's roughly expected)

guess i'll attach my mess and poke at the real world a little
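stepping back, a quick check of the patched freq_count logic from the session above, as a standalone sketch against np.fft.rfftfreq rather than the fftfreq wrapper in fourier.py (and note this 0.5 test only works when the frequencies run up to the full band, as already observed for longspace_freqs):

    import numpy as np

    def real_freq_count(freqs):
        # number of time samples implied by a full-band real-domain frequency list with a DC bin;
        # mirrors the patched expression (freq_count - 1) * 2 + (not has_nyquist)
        has_nyquist = bool(freqs[-1] == 0.5)
        return (len(freqs) - 1) * 2 + (not has_nyquist)

    assert real_freq_count(np.fft.rfftfreq(15)) == 15   # odd n: no nyquist bin, last freq is 7/15
    assert real_freq_count(np.fft.rfftfreq(16)) == 16   # even n: nyquist bin at exactly 0.5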
0951 I'm on 321 -> assert np.allclose(longvec, inserting_spectrum @ inserting_ift) . It looks like the two vectors are actually roughly matching:

(Pdb) p shortspace_freqs / longspace_freqs
array([ nan, 23.09213165, 23.09213165, 23.09213165, 23.09213165, 23.09213165, 23.09213165, 23.09213165, 23.09213165])
(Pdb) p (inserting_spectrum @ inserting_ift)[23]
0.71518936637242
(Pdb) p randvec[1]
0.7151893663724195

But not at interim locations:

(Pdb) p (inserting_spectrum @ inserting_ift)[12]
0.6963222972034951

I was imagining the step function would somehow magically break the indexing the same way I was, but it actually changes different frequencies at different samples, such that there is still interpolation between each value.

(Pdb) p np.count_nonzero((inserting_ift[:,:-1] == inserting_ift[:,1:]).all(axis=0))
815
(Pdb) p inserting_ift.shape[-1]
1600

A little under half of the samples are interpolated with their neighbors. Between 5 and 6, only 1 frequency changed. I imagine this is likely normal:

(Pdb) p inserting_ift[:,5]
array([ 0.0625, -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0.125 , 0.0625, 0.0625])
(Pdb) p inserting_ift[:,6]
array([ 0.0625, -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0.125 , 0.125 , -0.125 , 0.0625, 0.0625])
(Pdb) p longspace_freqs
array([0. , 0.00270655, 0.0054131 , 0.00811965, 0.0108262 , 0.01353275, 0.0162393 , 0.01894585, 0.0216524 ])

I'm using a 3-valued function now, of course.

1007 I tried changing to a square wave in different ways but i wasn't understanding things quite right. i think i'd need to remove some of the restrictions on the wavelets (which i wrote wrongly). that would be easier if i noted what they stemmed from. anyway, it's notable here that the long-output frequencies are an even multiple of the same-length-output frequencies. the data is basically the same. the interpolation arises only because the wavelets evaluate differently in the in-between samples.

I guess it's interesting to think about that, although I wonder how important it is. Each wavelet basically has a phase. The phase is selected in order to best match the data that is available, so each wavelet will change sign at some relatively arbitrary point between the samples. This is because the model of the signal is a sum of arbitrarily-phased waves, most of which are significantly below the nyquist frequency and hence cross many samples.

A model that would more accurately reflect the nearest-neighbor sampling I quickly wrote might be different. For one thing it wouldn't allow for phases that cross between samples. It could also use pulses rather than waves that span the whole length. It could also use more instances of higher frequencies. The most important component would be not allowing for phases that imply nonintegral sample index changes.

The phase is _roughly_ the arctangent of the real and imaginary component, both of which are linearly produced from the input based on the values of the function offset by 90 degrees. So the subinteger phases seem to come first from the fact that 90 degrees off a sample index is a subinteger sample, and second from the fact that the longvector output produces many many subinteger samples that are evaluated. Is this right? Maybe not? The subinteger phases are produced by the inverse of the matrix that uses the wave value at 90 degrees. It's further confusing stuff. Everything is so confusing when it gets all inhibited.
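a small sketch of the interpolation effect described here, using plain np.fft.rfft rather than the matrices in fourier.py: the same spectrum, evaluated at finer offsets, reproduces the original samples exactly and fills the gaps with band-limited (sinusoidal) interpolation rather than steps. The names and the odd length 15 are just illustrative choices:

    import numpy as np

    n, upsample = 15, 100
    x = np.random.random(n)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n)

    weights = np.full(len(spec), 2.0)
    weights[0] = 1.0                                  # DC is not doubled; odd n, so no nyquist bin

    t = np.arange(n * upsample) / upsample            # fractional sample positions
    dense = ((weights * spec.real) @ np.cos(2 * np.pi * np.outer(freqs, t))
             - (weights * spec.imag) @ np.sin(2 * np.pi * np.outer(freqs, t))) / n

    assert np.allclose(dense[::upsample], x)          # integer positions reproduce the samples
    # in-between positions vary smoothly: a sum of low-frequency sinusoids, not a step function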
I like atm that the puzzle is mostly abstract. I don't need to e.g. buy a motor or something, to test it. This means I can work on it engaging fewer of my behavioral inhibitions. The general question is: is it reasonable to model nearest-neighbor sampling using something like these matrices, based on fourier transforms? Or would that be a different approach? I'm imagining that say I have a signal at 40 Hz that is a step function, changing only at 1/40s, and then I sample it at say 37 or 82 Hz: I'd like to inform the decoder that the signal is a 40 Hz step function and recover it with precision. 1021
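purely as an illustration of that scenario (names and numbers hypothetical, not a solution): a signal that holds a new level every 1/40 s, read out by a sampler running at an unrelated 37 Hz.

    import numpy as np

    rng = np.random.default_rng(0)
    step_rate, sample_rate, duration = 40.0, 37.0, 1.0
    levels = rng.random(int(step_rate * duration))         # one held level per 1/40 s interval

    t = np.arange(int(sample_rate * duration)) / sample_rate
    samples = levels[(t * step_rate).astype(int)]           # zero-order-hold sampling of the step signal

the goal stated above would be to recover `levels` exactly from `samples`, given the decoder knows the 40 Hz step model.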
1024 With this approach, I'm thinking of how I noted the frequencies are basically the same: one is just evaluated in more places. So the situation where wavelet parameters are placed into a matrix could likely be changed to meet the goal. It's a little confusing in that complex numbers are stored there. Basically, each wavelet has two parameters: its real magnitude, and its imaginary magnitude. These are then combined to produce its output wave. Right? Wrong? Let's find out:

- The frequency data has a real and imaginary magnitude for each frequency
- The matrices contain real and imaginary magnitudes for each frequency/offset pair

So the wavelet is first encoded into the matrix with its 0 degree and 90 degree values stored as real and complex components. Then the wavelet is reproduced using the real and imaginary magnitudes.

For real-valued waves, though, it's simpler, because of the complex conjugate situation. Each 0 degree value is paired with positive and negative 90 degree values. The output becomes (real_magnitude * real_value - imag_magnitude * imag_value) * 2 . That works fine with sinusoids, I think? I think it's something like (real_mag * cos(x) - imag_mag * sin(x)) * 2? Actually, that doesn't look quite right. I'm testing it with numbers and it kind of looks like changing the phase changes the frequency ...
    x = np.arange(16)*np.pi*4/16; ((np.cos(x) - np.sin(x))*2).round(2)
    array([ 2.  ,  0.  , -2.  , -2.83, -2.  , -0.  ,  2.  ,  2.83,  2.  ,  0.  , -2.  , -2.83, -2.  ,  0.  ,  2.  ,  2.83])
I've made a mistake, as the magnitude is now 2.83 and the period is shorter than 2pi.

1035 when the matrix is generated for cosine, i believe the real component is the cosine of samp*freq, and the imaginary component is the sine of samp*freq . making the spirals. then, the input needs to phase shift that. i'm mostly concerned with real-valued signals atm, so i can simplify the math. the frequency magnitudes will be m_r + m_i j and m_r - m_i j . then i think in the matrix, it also has s_r + s_i j and s_r - s_i j ? so the output sample value is

    (m_r + m_i j) * (s_r + s_i j) + (m_r - m_i j) * (s_r - s_i j)

expanding ->

    m_r * (s_r + s_i j) + m_i j * (s_r + s_i j) + m_r * (s_r - s_i j) - m_i j * (s_r - s_i j)
    -> m_r * s_r + m_r * s_i j + m_i j * s_r + m_i j * s_i j + m_r * s_r - m_r * s_i j - m_i j * s_r - m_i j * s_i j
    -> m_r s_r + m_r s_i j + m_i s_r j - m_i s_i + m_r s_r - m_r s_i j - m_i s_r j + m_i s_i
    -> 2 m_r s_r

ok, um, i just got that all the imaginary parts cancel. i must have made another mistake?

- in the code, i have written that the negative frequencies are stored in their matrices as conjugates. this makes sense because the real value is an even function, and the imaginary value is an odd function.
- when i take an fft, i see that the frequency magnitudes are also conjugates
- in the code, i have written to subtract the imaginary product from the real product

ok, i think i failed to handle a double negative sign in the expansion. trying again. [1050]

    (m_r + m_i j) * (s_r + s_i j) + (m_r - m_i j) * (s_r - s_i j)
    -> m_r * (s_r + s_i j) + m_i j * (s_r + s_i j) + m_r * (s_r - s_i j) - m_i j * (s_r - s_i j)
    -> m_r * s_r + m_r * s_i j + m_i j * s_r + m_i j * s_i j + m_r * s_r - m_r * s_i j - m_i j * s_r + m_i j * s_i j

there's the double-negative i missed: - m_i j * - s_i j == + m_i j * s_i j

    -> m_r s_r + m_r s_i j + m_i s_r j - m_i s_i + m_r s_r - m_r s_i j - m_i s_r j - m_i s_i
    -> 2 m_r s_r - 2 m_i s_i

there we go. whew. So it takes the real magnitude and multiplies that by the 0 degree value, and the imaginary magnitude and multiplies that by the 90 degree value, and does indeed subtract the one from the other.

It's clear that works for 90 degree phases, because then we get the -90 degree value, rather than the 0 degree value. For 45 degree phases, it seems a little different? Maybe not? I think the key could be related to the real and imaginary parts of the spectrum being already axially projected: they're already cosines and sines. So if the phase is phi, then the imaginary magnitude is sin phi, and the real magnitude is cos phi. If we imagine the wave as an actual spiral, one can finally see how the phase comes out. The magnitude is the pair of coefficients of sine and cosine: the actual values of the sine and cosine of the angle. So multiplying them by the sine or by the cosine of the wave can actually reproduce it, scaling each component appropriately. Each side of the matrix is kind of asking, how much is the cosine scaled? and how much is the sine scaled? and then they reproduce those.

[... then why is the wave produced via cos - sin? instead of some kind of dot product maybe?]

[ ok .. uh .. how are real-domain waves produced. what if the magnitude is 1.0 for both sin and cosine. this makes for cos(time) - sin(time) ... cos(time) - sin(time) is going to have minima and maxima wherever its values are most extreme, and they're in-phase, so i think this would be at 45 degrees, not sure .. i tried some numbers and it looks like 3pi/4 is a rough minimum.
that is indeed about 45 degrees. this minimum value is apparently -sqrt(2). maybe there is some trigonometric identity that converts that, not sure. but sqrt(2) is indeed the magnitude of [1,1]. so that's the right magnitude. and then 45 degrees is the right phase. it's interesting that x*cos(t) - y*sin(t) projects on time that way, making a sine wave with phase equal to atan2(y, x) and magnitude equal to sqrt(x*x+y*y). although i only checked with 45 degrees, this is apparently true since the fourier transform assumes it. ]

it would help me figure this out to understand why that happens: why it is that x*cos(t) - y*sin(t) makes a real-domain sine wave with phase == atan2(y, x).

I guess I could think of it as a linear interpolation across 90 degrees. Considering that, I can almost see how it might work. We can think of the sin(t) == 0 point as one projection line, and the cos(t) == 0 point as another projection line. It interpolates in a circular manner, projecting between these. Let's see, um ... so it starts at sin(t) == 0, and here it projects solely the cosine component. As it moves from sin(t) == 0, increasing t until cos(t) == 0, the projection changes. The cosine component reduces, and the sine component increases. The amounts by which these happen are exactly proportional to the orthographic projection of a circle, cylinder, or spiral, when viewed from the side and rotated.

So, two things are changing. "t" is changing as we move from sin(t)=0 to cos(t)=0. At the same time, "phi" changes as we change the phase of the wave from m_s = 0 to m_c = 0; that is, from sin(phi) = 0 to cos(phi) = 0. Each of these projections is the X and Y coordinates of an angle.

[uhhhhh !!!]

so we might think of the angle as a rod sticking out of a center. It's projected onto two dimensions: the X axis and the Y axis. Each of these two values is a different view of the same rod representing the angle. This rod could instead be a point along a spiral, or a plane cutting a cylinder. The cylinder view could overlay a view of the wave, but I'm not yet sure if that choice helps things line up.

Then when we move across time, the same thing is happening. There is some rod, that is now spinning with angle t, and if the wave is charted we see its Y or X coordinate move up and down. We only see one projection of this one, but it is calculated from both.

it is so hard to hold these things in my mind.

OK, so that X or Y coordinate is made from an underlying X or Y coordinate used to calculate it. There is a "model wave", that might better be considered something other than a wave for this, that has _both_ X and Y coordinates. These X and Y coordinate values help it track where it is.

grrrrrr ok um 3 parts
- the magnitude and phase
- the model wave
- the actual wave

for the magnitude, we have the X and Y coordinates. We can view it from the X axis, or from the Y.
for the model wave, we also have the X and Y coordinates at every point

the actual wave is X_mag * X_model - Y_mag * Y_model

one thing we can think of is how the phases line up. as X_mag goes to 0, this means that the phase is entering all of the Y_mag space. this aligns with where Y_model is 0 or 1. i've written and looked at that repeatedly. if we are at a shallow angle, say X_mag is small and Y_mag is big, this means we are only a bit off from being aligned with X_model. so, X_model has a little bit, but it is mostly Y_model.

so what's key here is that X_mag is the cosine of phi and Y_mag is the sine of phi, whereas X_model is the cosine of time and Y_model is the sine of time. then after that: sines and cosines are orthogonal projections of rotations. i'm having a lot of inhibition around this conclusion and i'm not sure all the parts are present enough to write enough logical steps to retain it. but i know there is a way.

so not only are X_mag and Y_mag the cosine and sine of phi, the wave phase, they are also the X and Y projections of it, and can be used to scale something into that angle. They can also be used to perform those orthogonal projections, or to rotate and scale an axial vector into their space.

well, skipping a lot of projection, i tried websearching for "rotation formula" and found cos theta - sin theta . basically, multiplying an axis by cos theta - sin theta projects a rotation of a vector lying in that axis, of theta, back onto the axis. so there is no longer a need for projection planes that spin at 90 degree offsets around cylinders and spirals and rods.

so what this result, 2(cos phi * cos t - sin phi * sin t), is doing, is rotating the real component of the wave by phi and by t both, treating it as a 2-dimensional vector. so if my square wave spews out a step function, that step function will be treated as a vector to then be rotated and reprojected. it's not really accurate to use just the step function, when the phase shifting is still mag_x * wave(t) - mag_y * wave(t+0.25) . and this linear product is performed in the matrix multiplication, so if i were to draft massaging it into sinusoidal space then back to square wave space, it would presently add code outside the implementation functions. the phase change is made by applying the rotation function (cos * a - sin * b) to the sampled model wave, and directly taking and using the output.

I think this means that as the phase is changed, the output wave will actually smoothly slide up and down, if it's a step function! For my idea of using square waves or step functions, I imagine a phase change shifting them left and right, not sliding them up and down. This makes it relatively clear that non-sinusoid wave models would probably want a different matrix, in this approach, because the way the multiplication is done assumes that rotating the model samples produces a phase shift.

[ Thinking about this, one can see how the meaning of complex multiplication is angle addition, by seeing how distributed multiplication (a + bi)(c + di) makes a rotation matrix as its expansion. ]
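a quick numeric check of the two identities worked out above, as a standalone sketch with no ties to fourier.py:

    import numpy as np

    rng = np.random.default_rng(0)
    m = rng.random() + 1j * rng.random()            # a frequency magnitude m_r + m_i j
    t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
    s = np.exp(1j * t)                              # the model "spiral": cos t + j sin t

    # 1. a conjugate pair sums with all imaginary parts cancelling, leaving 2(m_r s_r - m_i s_i)
    pair = m * s + np.conj(m) * np.conj(s)
    assert np.allclose(pair.imag, 0)
    assert np.allclose(pair.real, 2 * (m.real * s.real - m.imag * s.imag))

    # 2. x*cos(t) - y*sin(t) is one sinusoid with magnitude hypot(x, y) and phase atan2(y, x)
    x, y = m.real, m.imag
    assert np.allclose(x * np.cos(t) - y * np.sin(t),
                       np.hypot(x, y) * np.cos(t + np.arctan2(y, x)))

the second assert is just the angle-addition identity cos(t + phi) = cos t cos phi - sin t sin phi with x = R cos phi and y = R sin phi, which is the rotation-formula reading above.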
- pull out the wavelet stuff, and keep everything sinusoids
- collect notes about why wavelets and step functions would need a different matrix structure

the reason to not switch to this structure is slow progress. let's keep moving with the fourier approach.
i've removed the wavelet things

i can now make the assertions pass, in theory, if i produce longvec based on a sinusoid signal. i have not done this.

this is the content i ended up remembering:

    #### Notes on wavelets
    # I naively tried using other functions than sinusoids for a DFT matrix.
    # Given complex multiplication performs a scale + rotation operation on the model samples,
    # only cos(t) + sin(t)i works as a coefficient and an underlying wave, without changing
    # how the waveforms are evaluated, because it performs a rotation over time.
    # Some other ideas include:
    # - frequency error/resonance matrix, outputs the error or resonance for each frequency,
    #   without regard to underlying shape
    # - converting between a sinusoid spectrum and one based on wavelets, in frequency space
    # - evaluating wavelets forward pass only, and thinking on the result
    # - other ideas [note: could look up wavelets]
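a tiny demonstration of that first note, hedged as an illustration only (square() here is a plain sign-of-cosine stand-in, not anything that was in fourier.py): mixing a square wave with its quarter-period offset by (cos phi, -sin phi) does not time-shift it the way the same mix phase-shifts a cosine; it produces a different, multi-level shape, which is the "slides up and down" behavior noted above.

    import numpy as np

    def square(t):
        # sign-of-cosine square wave, period 1 (illustrative stand-in)
        return np.sign(np.cos(2 * np.pi * t))

    def wave_mix(wave, t, phi):
        # the phase-shift mix discussed above: mag_x * wave(t) - mag_y * wave(t + 0.25)
        return np.cos(phi) * wave(t) - np.sin(phi) * wave(t + 0.25)

    t = np.arange(64) / 64
    phi = np.pi / 4

    cos_wave = lambda u: np.cos(2 * np.pi * u)
    assert np.allclose(wave_mix(cos_wave, t, phi), np.cos(2 * np.pi * t - phi))   # a true phase shift

    print(np.unique(wave_mix(square, t, phi).round(3)))   # three levels: the square wave is reshaped, not shifted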
I'm pretty inhibited atm and a smaller task feels nice. I don't really know. I'm sure I can return to this eventually. I also have tasks that provide more return than this, but it's pretty rare for me to stabilize work on a task, so I'm likely to stay near this to support that concept.
I have fourier.py with tests for "extracting a repeating wave" that are not passing simply because the wave is sampled very differently from its model

0614 0615 maybe i'll switch back to vis.py; oh that's inhibited too

0616 i actually have a small EEG going. when i try to start working on this i experience sparkles on my skin and other things that show clear miswiring associated with the inhibition. the EEG lets me press a button to indicate an event. i guess i'd like this to be easier than that. also i feel the inhibition changing in response to having some of its paths possibly isolatable in a log.

0617 thinking about fourier.py, samples, etc
i've drafted a different interpolator, like before. the test is not passing. my usual approach has been to examine the arithmetic by hand. i have a weird inhibition against doing things that could appear unreasonably laborious to others. it could make an interesting (and quite useful) exercise to practice just that, somehow.
(Pdb) p inserting_spectrum
array([ 9.13008935-2.23523530e-15j, -1.13139897-1.72422981e-01j, 0.11932716-6.18590050e-01j, -0.47619445-1.42077247e+00j, 0.60106394-6.17441233e-01j, 0.4321097 -8.80843476e-01j, 0.92222665+1.15314024e+00j, -0.4839133 -2.10053038e-01j, -0.31551473-6.23392187e-16j])

(real domain)

this code differs from fourier; i get to figure out why:

330 B       inserting_spectrum = randvec @ unextracting_ft
331  ->     for idx in range(len(longvec)):
332             longvec[idx] = 0
333             for freq_idx in range(len(inserting_spectrum)):
334                 freq = shortspace_freqs[freq_idx]
335                 mag = inserting_spectrum[freq_idx]
336                 longvec[idx] += np.abs(mag) + np.cos(2 * np.pi * freq * idx / len(randvec) * short_duration + np.angle(mag))

331         for idx in range(len(longvec)):
332             longvec[idx] = 0
333             for freq_idx in range(len(inserting_spectrum)):
334                 freq = shortspace_freqs[freq_idx]
335                 mag = inserting_spectrum[freq_idx]
336  ->             longvec[idx] += np.abs(mag) + np.cos(2 * np.pi * freq * idx / len(randvec) * short_duration + np.angle(mag))

oops! that 'mag) + np.cos' should be a *! still fails of course. 0329
# this is after all summing (using multiplication by cos instead of addition):
(Pdb) p longvec[0]
8.797795339894686
# this is supposed to be the same, and isn't
(Pdb) p (inserting_spectrum @ extracting_ift)[0]
0.5488135039273254

0413

(Pdb) complex_shortspace_freqs = fftfreq(complex = True, dc_offset = True, repetition_samples = len(randvec))
(Pdb) p complex_freq2time = create_freq2time(freqs=complex_shortspace_freqs)
(Pdb) p complex_freq2time[:,0]
array([0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j, 0.0625+0.j])

Since the index is zero, the first sample is the average of all the magnitudes.

(Pdb) p extracting_ift[:,0]
array([ 0.0625, -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.125 , -0. , 0.0625, -0. ])

extracting_ift is real-domain-only, so it is supposed to store some values doubled. It looks like they cancel and amplify. I made the wrong frequencies. These are the longspace ones. Oops. I should probably check the shortspace ones before the longspace ones.

(Pdb) p randvec[0]
0.5488135039273248

Looks like the first value is right; so in theory something about my cosine code is wrong. Here's how it's generated properly:

(Pdb) p (inserting_spectrum * extracting_ift[:,0]).sum()
0.5488135039273254

I'm remembering that I'm ignoring the doubling performed by the real-domain transform. That must be the next mistake.
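for reference, a hedged sketch of the magnitude-and-phase reconstruction that loop seems to be aiming for, written against plain np.fft.rfft conventions rather than the shortspace/longspace scaling in the test: each interior bin stands for a conjugate pair and is doubled, while the DC bin (and a nyquist bin, when present) is not.

    import numpy as np

    n = 16
    x = np.random.random(n)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n)

    scale = np.full(len(spec), 2.0)      # interior bins count twice (their conjugates are implicit)
    scale[0] = 1.0                       # the DC bin is its own conjugate
    if n % 2 == 0:
        scale[-1] = 1.0                  # so is the nyquist bin, when n is even

    idx = np.arange(n)
    recon = sum(s * np.abs(m) * np.cos(2 * np.pi * f * idx + np.angle(m))
                for s, f, m in zip(scale, freqs, spec)) / n
    assert np.allclose(recon, x)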