oh :D I need to transpose the matrix when passing it to np.linalg.solve , because np.linalg.solve does Ax = b right-to-left, not xA = b left-to-right like I have been doing. 0857 . Now it finishes and produces the exact random data, and it's just like the fourier.py test . The original test I had written had a smaller waveform size than recording size (16 vs 256). If I want to process the recording all at once, this could mean a rectangular matrix. Maybe I'll just try doing that, and see how it goes. 0900 I bumped the recording size back to 256, and changed np.linalg.solve to np.linalg.lstsq and made no other changes, and got this: (Pdb) cont Traceback (most recent call last): File "/usr/local/lib/python3.10/pdb.py", line 1726, in main pdb._runscript(mainpyfile) File "/usr/local/lib/python3.10/pdb.py", line 1586, in _runscript self.run(statement) File "/usr/local/lib/python3.10/bdb.py", line 597, in run exec(cmd, globals, locals) File "<string>", line 1, in <module> File "/shared/src/scratch/test2_upsamplish.py", line 22, in <module> assert np.allclose(np.fft.fft(waveform) @ waveform_freq_2_recording_time, recording) File "<__array_function__ internals>", line 200, in allclose File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2270, in allclose res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan)) File "<__array_function__ internals>", line 200, in isclose File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2380, in isclose return within_tol(x, y, atol, rtol) File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2361, in within_tol return less_equal(abs(x-y), atol + rtol * abs(y)) ValueError: operands could not be broadcast together with shapes (16,) (256,) It's interesting, it looks like the failure is inside numpy. It's actually inside the allclose call. I guess I can figure this out. (Pdb) list 17 recording = sample_sinusoids_funny(waveform, sample_idcs, max_period) 18 19 recording_rate = 1 20 waveform_rate = (waveform_N / max_period) * recording_rate 21 waveform_freq_2_recording_time = fourier.create_freq2time(waveform_rate, recording_rate, waveform_N, recording_N) 22 -> assert np.allclose(np.fft.fft(waveform) @ waveform_freq_2_recording_time, recording) 23 waveform_freq_reconstructed = np.linalg.lstsq(waveform_freq_2_recording_time.T, recording) 24 waveform_freq_np = np.fft.fft(waveform) 25 assert np.allclose(waveform_freq_reconstructed, waveform_freq_np) 26 waveform_reconstructed = np.fft.ifft(waveform_freq_reconstructed) 27 assert np.allclose(waveform, waveform_reconstructed) (Pdb) p recording.shape (256,) (Pdb) p (np.fft.fft(waveform) @ waveform_freq_2_recording_time).shape (16,) Looks like I have my dimensions sideways when they are not equal, in fourier.py . (Pdb) p waveform_freq_2_recording_time.shape (16, 16) No ... looks like I am using a frequency dimension where I should be using a time dimension. On to fourier.py ! 0903 # AGPL-3 Karl Semich 2022 import numpy as np def create_freq2time(freq_rate, time_rate, freq_count, time_count): freqs = np.fft.fftfreq(freq_count) offsets = np.arange(freq_count) * freq_rate / time_rate mat = np.exp(2j * np.pi * np.outer(freqs, offsets)) return mat / freq_count # scaled to match numpy convention It's making a square matrix, and I want a rectangular one. Here it is: offsets = np.arange(freq_count) * freq_rate / time_rate I'll need as many sample offsets as there are samples in the time domain, to generate them. 0904 That change passed that assertion, and now it is failing with np.lstsq . This might mean I get to engage more theory to do it better! (Pdb) cont /shared/src/scratch/test2_upsamplish.py:23: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`. waveform_freq_reconstructed = np.linalg.lstsq(waveform_freq_2_recording_time.T, recording) Traceback (most recent call last): File "/usr/local/lib/python3.10/pdb.py", line 1726, in main pdb._runscript(mainpyfile) File "/usr/local/lib/python3.10/pdb.py", line 1586, in _runscript self.run(statement) File "/usr/local/lib/python3.10/bdb.py", line 597, in run exec(cmd, globals, locals) File "<string>", line 1, in <module> File "/shared/src/scratch/test2_upsamplish.py", line 25, in <module> assert np.allclose(waveform_freq_reconstructed, waveform_freq_np) File "<__array_function__ internals>", line 200, in allclose File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2270, in allclose res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan)) File "<__array_function__ internals>", line 200, in isclose File "/home/user/.local/lib/python3.10/site-packages/numpy/core/numeric.py", line 2363, in isclose x = asanyarray(a) ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part. I don't really know what this failure means. 0906 0907 oh, it's actually failing within another allclose call. the lstsq call completed successfully. 23 waveform_freq_reconstructed = np.linalg.lstsq(waveform_freq_2_recording_time.T, recording) 24 waveform_freq_np = np.fft.fft(waveform) 25 -> assert np.allclose(waveform_freq_reconstructed, waveform_freq_np) 26 waveform_reconstructed = np.fft.ifft(waveform_freq_reconstructed) (Pdb) p waveform_freq_reconstructed.shape *** AttributeError: 'tuple' object has no attribute 'shape' (Pdb) p waveform_freq_reconstructed (array([ 9.13008935-2.05511125e-15j, -1.13139897-1.72422981e-01j, 0.11932716-6.18590050e-01j, -0.47619445-1.42077247e+00j, 0.60106394-6.17441233e-01j, 0.4321097 -8.80843476e-01j, 0.92222665+1.15314024e+00j, -0.4839133 -2.10053038e-01j, -0.31551473-6.30467900e-14j, -0.4839133 +2.10053038e-01j, 0.92222665-1.15314024e+00j, 0.4321097 +8.80843476e-01j, 0.60106394+6.17441233e-01j, -0.47619445+1.42077247e+00j, 0.11932716+6.18590050e-01j, -1.13139897+1.72422981e-01j]), array([2.16596661e-24]), 16, array([1.03318085, 1.03318084, 1.03317917, 1.03309466, 1.03124288, 1.01732177, 0.99161519, 0.98123564, 0.9801968 , 0.98016181, 0.98016142, 0.98016142, 0.98016142, 0.98016142, 0.98016142, 0.98016142])) The frequencies look correct, but it's returning a tuple rather than a matrix. It looks like it's showing that the need to process the return value of lstsq correctly. It's exciting there's more information than just the least squares solution. As I was adding it, I was wondering how I would discern error efficiently.
help(np.linalg.lstsq)
Returns ------- x : {(N,), (N, K)} ndarray Least-squares solution. If `b` is two-dimensional, the solutions are in the `K` columns of `x`. residuals : {(1,), (K,), (0,)} ndarray Sums of squared residuals: Squared Euclidean 2-norm for each column in ``b - a @ x``. If the rank of `a` is < N or M <= N, this is an empty array. If `b` is 1-dimensional, this is a (1,) shape array. Otherwise the shape is (K,). rank : int Rank of matrix `a`. s : (min(M, N),) ndarray Singular values of `a`. It returns x, residuals, rank, s. So it found the right solution, the error is 2e-24 which is very low, there are 16 data elements, and it also gives 16 singular values. I'm not sure what a singular value is, although I think I learned once, but it sounds a little important. What's immediately interesting is the solution and the error. In a larger work, the error could indicate background signals that have not been modeled, maybe. draft saved at 0911 am. It works now, using np.lstsq to exactly find the 16 random samples repeated within a 256 sample long recording. A further avenue might be to add background noise and see how the noise magnitude relates to the recording duration needed to completely remove the noise, and the error reported by the solution. Ideally you'd figure this in theory, and then verify by testing. A different further avenue might be to do it while blind to the repeating frequency of the waveform in the data. (note: it is the only signal here). Attached is fourier.py with the dimension bug fixed.