Gradient estimation notebook #1549
🔥 New notebook just dropped! @amir-naveh, @TomerGoldfriend, come check out this shiny new addition to our repo.
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
I think it is better to move the utility functions to an adjacent Python file, to keep the notebook clean. If there is a specific utility whose definition is valuable for the reader, you can keep it inline.
There was a problem hiding this comment.
Spelling issues according to Claude:
- Cell 6: "Classicaly" → "Classically"
- Cell 6: "in specific point" → "at a specific point"
- Cell 7: "if there are more than one coordinates" → "if there is more than one coordinate"
- Cell 8: "explaination" → "explanation" (appears multiple times)
- Cell 8: "miss some" → "misses some"
- Cell 8: "fill in this details" → "fill in these details"
- Cell 8: "resultion" → "resolution"
- Cell 9: "repetetive" → "repetitive"
- Cell 9: "beggining" → "beginning"
- Cell 18: "complitly" → "completely"
- Cell 25: "unefficient" → "inefficient"
- Cell 40: "explaination" → "explanation"
- Cell 46: "drasticly" → "drastically"
- Cell 59: "sucess" → "success"
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #53. def compute_success_rate(pc, analytic_derivatives, reject_underresolution=False):
I think you can simplify this function using the dataframe results
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Stick to the format under the algorithms directory. See, for example, Simon or Deutsch-Jozsa. This applies specifically to the beginning of the notebook: what is the input, output, complexity, etc.
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
- \ket is not parsed well in our documentation. Switch to |...\rangle instead.
- In step 1 you talked about several coordinates, but the rest of the formulas are for a single variable only.
- "if there are more than one coordinates" - fix grammar
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
1 - "Exact version of the algorithm" - I think 'exact' is misleading, because you still get an approximation of the gradient. Maybe "Full" version or "General" version is more suitable.
There was a problem hiding this comment.
2 - There are several approximations to the gradient, e.g. backward, forward, symmetric, etc. It would be nice to write something about which version you use here, and whether it can work with the other versions.
There was a problem hiding this comment.
3 - "so doing a similar trick we can define... " - I think that the equation is not correct, or I am missing something.
There was a problem hiding this comment.
4 - What is the tradeoff in l? Why shouldn't one take it to be very small?
It would be good if you could describe the gradient accuracy (in a formula) in terms of the problem parameters.
There was a problem hiding this comment.
5 - In general, this explanation is quite cumbersome. Try to think of how to simplify it; consider drawing it.
There was a problem hiding this comment.
1 - Fixed
2 - I tried to add it, but I feel that it hurts the flow, and doesn't add much. What do you think?
3 - I think it is correct. The paper uses the conversion formula delta_measured = N/m * grad(f) (see page 2, in the middle of the left column). The -N/2 makes delta a signed value instead of an unsigned one. I think it is clearer this way (because we don't assume that the gradient is positive). The paper mentions this (page 2, at the end of the left column), but I think it is more confusing that way.
4 - TODO
5 - TODO
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
repetetive -> repetitive (run a spelling and grammar check on all of the text)
"Phase Kickback (as in the cited paper) and Prepare State " -> I guess you meant 'direct phase'
There was a problem hiding this comment.
2 - "A technical note: " - the comment is not needed
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
I would explain what input model for f is assumed for phase kickback vs. the direct case.
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #14. def main(x: Output[QNum[n]], ancilla: Output[QNum[n0, SIGNED, n0]]):
Consider not defining ancilla as an output. Instead, define it locally and clean it up after use.
There was a problem hiding this comment.
It caused me some problems. I created a temporary cell (search for "Ancilla not as output"), and for now kept the original cells.
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #44. # Pay attention that we use a statevector simulator here
You can now use the new functions sample and calculate_state_vector; they will save a lot of boilerplate code.
There was a problem hiding this comment.
The main reason for the context manager is to use es.set_measured_state_filter("ancilla", lambda v: v == 0).
If I fix the previous problem with the ancilla, I'll be able to get rid of it.
I want to discuss the different ways to execute the code and the best practices, because it confuses me.
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #25. numeric_value = -0.5**n0
I think it will be much simpler and more robust to have a prepare_phase_gradient_state function that works with a QArray, so the type of the qnum will not matter. See https://arxiv.org/pdf/2409.04643, page 10.
There was a problem hiding this comment.
I used the prepare_probabilities function
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
I think this part is a bit overexplained. I would plot the corrected phase in the first place.
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
I agree, but I think that if I write “depending on the function f” it will sound as if it depends on the mathematical function (i.e. linear, quadratic, etc.). Maybe a better way to write it is “depending on the oracle f”?
Changed to this for now.
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #18. prepare_complex_amplitudes(magnitudes=magnitudes, phases=phases, out=x)
This function is not scalable. Why didn't you use the phase function directly? When f is a polynomial, it should be quite straightforward. The point is that it is sometimes beneficial to load f directly into the phase rather than pass through the digital encoding (which sometimes requires additional qubits or depth as well).
There was a problem hiding this comment.
Fixed in all places, except for the multi-coordinate case (TODO).
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
I think it is beneficial to show that the linear approximation does not happen in this step, only in the next step (the QFT).
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #6. def main(x: Output[QNum[n]]):
Why didn't you define x as a signed integer? Then you would save the ugly post-processing.
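For context, the post-processing in question is presumably a reinterpretation of an unsigned readout as a signed value. A minimal sketch of that conversion, assuming a two's complement interpretation; `to_signed` is a hypothetical helper, not a function from the notebook or from any library:

```python
def to_signed(value: int, num_bits: int) -> int:
    """Reinterpret an unsigned num_bits-wide integer as two's complement.

    Hypothetical helper for illustration only; the notebook's actual
    post-processing may differ.
    """
    if not 0 <= value < (1 << num_bits):
        raise ValueError("value does not fit in num_bits")
    half = 1 << (num_bits - 1)
    # Values in the upper half of the range represent negative numbers.
    return value - (1 << num_bits) if value >= half else value


# Example: on 3 bits, the readouts 0..7 map to 0, 1, 2, 3, -4, -3, -2, -1.
signed_values = [to_signed(v, 3) for v in range(8)]
```

Declaring the variable as signed would let the framework do this mapping, so the manual step disappears.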
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #23. majority_state = pc[0].state
1 - I am not sure whether the results are always sorted. Better to work with the dataframe and sort it.
2 - Consider plotting a graph with the parsed gradient on the x axis and the probability on the y axis, and a dotted vertical line on the analytical gradient. (then do it for the next examples as well)
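A minimal sketch of the first point, using plain Python records in place of the dataframe so that it stays self-contained. The `outcomes` data and the `majority_outcome` helper are hypothetical; the idea is only that the majority state is selected by probability, not by position:

```python
def majority_outcome(outcomes):
    """Return the (state, probability) pair with the largest probability.

    Sorts explicitly instead of assuming the results arrive ordered.
    """
    ranked = sorted(outcomes, key=lambda item: item[1], reverse=True)
    return ranked[0]


# Hypothetical measurement results, deliberately unsorted.
outcomes = [({"x": 3}, 0.10), ({"x": 5}, 0.85), ({"x": 4}, 0.05)]
state, probability = majority_outcome(outcomes)
```

The same ranked list could then feed the suggested plot, with the parsed gradient on the x axis and the probability on the y axis.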
| @@ -0,0 +1,2289 @@ | |||
| { | |||
There was a problem hiding this comment.
I am a bit confused about the m and l parameters.
I would work with the problem parameters |grad(f)| (the maximal size of the gradient) and epsilon (the required accuracy of the result). Say you know both of them: how would you choose l, m and the number of qubits? Write it as an equation, assuming that you have an accurate phase oracle. Then you can also consider the case where the calculation is done with a digital oracle (phase kickback), which adds another source of error, and state the required number of qubits for that calculation (not a must, though).
From your description it seemed to me that the choices are not coupled. They should all be derived from the problem parameters eps and |grad|.
| @@ -0,0 +1,2122 @@ | |||
| { | |||
There was a problem hiding this comment.
- We will explore quantum algorithm --> a quantum algorithm.
- "Output: ..." --> "... at the origin encoded on a quantum variable of size n".
- Can you add some keywords? See how it is done in other notebooks in the Algorithms directory.
- Add a References section at the end and refer to the original paper in the intro.
| @@ -0,0 +1,2122 @@ | |||
| { | |||
There was a problem hiding this comment.
In your figure and equations it seems that the separation on the x axis is 1. But you use the approximation f(x)=f(0)+x \grad f(0), which holds only for small x.
Try to combine the two figures somehow... maybe side by side. Or even two curves on the same plot, but with some arbitrary shift between them.
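To illustrate the point about small x, here is a short numerical sketch of how the error of the linear approximation shrinks with the step. The test function exp(x), whose derivative at the origin is 1, is an arbitrary choice for illustration and is not taken from the notebook:

```python
import math


def linearization_error(f, grad_at_zero, x):
    """Absolute error of the first-order approximation f(0) + x * f'(0)."""
    return abs(f(x) - (f(0.0) + x * grad_at_zero))


# Hypothetical test function: exp(x), with derivative 1 at the origin.
errors = {step: linearization_error(math.exp, 1.0, step) for step in (1.0, 0.1, 0.01)}

for step, err in errors.items():
    print(f"x = {step:<5} error = {err:.2e}")
```

For this function the error falls roughly quadratically with the step, which is why a grid spacing of 1 in the figure sits uneasily with the linear formula.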
| @@ -0,0 +1,2122 @@ | |||
| { | |||
There was a problem hiding this comment.
I think N_0 is not defined yet. You can mention that in some scenarios there are more parameters to tune, or even state the scenario explicitly if it takes only a few words. Another option is not to mention this parameter at this point.
- "Let's define: " --> We assume the values of the following parameters (or some approximation of them) are given:
| @@ -0,0 +1,2122 @@ | |||
| { | |||
There was a problem hiding this comment.
We usually talk about three different encodings of an expression: phase encoding e^{if(x)}|x>, amplitude encoding f(x)|x> (sometimes requiring post-selection or an ancilla), and digital encoding |f(x)>.
If I understood correctly, here the input is a digital encoding, and you apply phase kickback to transform it into a phase encoding, right?
If yes, then mention it: "The first step of the algorithm is to prepare the state: ..., given the oracle |x> --> |f(x)>".
| @@ -0,0 +1,2122 @@ | |||
| { | |||
There was a problem hiding this comment.
Show the mathematics here (or in an appendix if this is too long) on why this gives a phase kick-back.
I think you should add that we need to post-select on the ancilla being zero.
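As a starting point for that explanation, here is a small numerical check of the textbook kickback argument. It assumes the ancilla holds the state (1/sqrt(N_0)) * sum_a exp(-2*pi*i*a/N_0)|a> and that the oracle adds f(x) to the ancilla modulo N_0. Both are assumptions for illustration, and the notebook's construction (including the post-selection step) may differ:

```python
import cmath
import math


def kickback_ancilla(n0):
    """Amplitudes of (1/sqrt(n0)) * sum_a exp(-2*pi*i*a/n0) |a>."""
    amp = 1.0 / math.sqrt(n0)
    return [amp * cmath.exp(-2j * cmath.pi * a / n0) for a in range(n0)]


def add_mod(state, shift):
    """Apply |a> -> |a + shift mod n0> to a vector of amplitudes."""
    n0 = len(state)
    out = [0j] * n0
    for a, amplitude in enumerate(state):
        out[(a + shift) % n0] = amplitude
    return out


n0, f_x = 8, 3  # hypothetical register size and function value
ancilla = kickback_ancilla(n0)
shifted = add_mod(ancilla, f_x)

# The addition leaves the ancilla unchanged up to the phase exp(2*pi*i*f(x)/n0).
expected_phase = cmath.exp(2j * cmath.pi * f_x / n0)
max_deviation = max(abs(s - expected_phase * a) for s, a in zip(shifted, ancilla))
```

Because this ancilla state is an eigenvector of modular addition, the only effect of the oracle call is the phase, which is attached to |x>.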
| @@ -0,0 +1,2122 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #3. p = params()
We typically try to avoid using helper Python files. It depends on how heavy the calculation is and how critical it is for understanding the idea of the algorithm. I do not know what the situation is here... so it is just a thought for now.
There was a problem hiding this comment.
Originally I had the helpers at the top of the file, and Or suggested moving them to a separate file.
I use helpers in many places to avoid repetitive 'non interesting' code, but I can write the code explicitly if it makes more sense. Let's discuss it in person.
There was a problem hiding this comment.
OK, yes, keep it as is and let's discuss.
| @@ -0,0 +1,2122 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #24. prepare_state(probabilities=probabilities, bound=0.01, out=ancilla)
- Why did you use a bound of 0.01?
- You can do:
allocate(ancilla)
ancilla ^= -1/N_0 # or prepare_basis_state([1]*N_0, ancilla)
- The variable name ancilla is confusing here, because it typically means this is a qubit which returns to zero.
- Use the new calculate_state_vector instead of the ExecutionSession.
- Try to sort the dataframe; the result will be more visible, I think.
There was a problem hiding this comment.
1. I didn't like ^= -1/N_0 because it assumes that the ancilla is signed fractional, and if it's not, it won't work. I changed it to prepare_basis_state.
2. "The variable name ancilla is confusing here, because it typically means this is a qubit which returns to zero."
As you also wrote before, we can post-select on the ancilla being zero, and we can also apply the inverse QFT to return it explicitly to zero. I didn't do it because I felt it isn't necessary and is just a waste of resources, but if it is more accurate I can do it. The name ancilla is used in the paper.
3. I really liked the calculate_state_vector :)
4. Done
There was a problem hiding this comment.
2. Hmm, so you can uncompute the ancilla by applying an inverse QFT to it. Interesting... OK, let's discuss this in person as well.
| @@ -0,0 +1,2180 @@ | |||
| { | |||
There was a problem hiding this comment.
What if we want the gradient not at the origin, but at some other point x_0?
It should be written somewhere that the idea of the algorithm is to first encode the function in the phase, and that the result is an in-place encoding of the gradient. The idea of in-place encoding should be mentioned.
| @@ -0,0 +1,2180 @@ | |||
| { | |||
There was a problem hiding this comment.
Can we avoid the dataframe warning somehow (not just hide it from the output, but actually not raise it)?
| @@ -0,0 +1,2180 @@ | |||
| { | |||
There was a problem hiding this comment.
The output contains a runtime error. Why is that? Can we avoid it (not just hide it from the output, but actually not raise it)?
| @@ -0,0 +1,2180 @@ | |||
| { | |||
There was a problem hiding this comment.
Assuming that we can simplify the code of the second 2D example (see comment below), I think we can remove the first, simple use case and keep only the second one.
| @@ -0,0 +1,2180 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #20.
I think it is possible to work with a QStruct here, or a QArray[QNum[n, SIGNED, 0], 2]. Let's discuss this in person.
| @@ -0,0 +1,2180 @@ | |||
| { | |||
There was a problem hiding this comment.
N: int,
x0: float = 0.0,
d: int = 1,
) -> Callable:
| @@ -0,0 +1,1989 @@ | |||
| { | |||
There was a problem hiding this comment.
About the plot: if you like, you can set the first marker to "o" and the second one to ".", and it shows the fit nicely.
| @@ -0,0 +1,1989 @@ | |||
| { | |||
There was a problem hiding this comment.
Write that you will now define the quantum functions that calculate the gradient, for the 1D and multi-dimensional cases.
| @@ -0,0 +1,1989 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #14. @qfunc
The two qfuncs have the same docstring, no?
Write that the second one is for the multi-dimensional case.
(BTW, does the algorithm work for a different discretization on each axis?)
There was a problem hiding this comment.
It works. The current implementation of phase_oracle uses the same l for all coordinates, but that is just for convenience; it will work with an arbitrary l on each axis.