Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] multiplex is nondeterministic #27

Closed
den1k opened this issue Aug 28, 2018 · 6 comments
Closed

[bug] multiplex is nondeterministic #27

den1k opened this issue Aug 28, 2018 · 6 comments

Comments

@den1k
Copy link

den1k commented Aug 28, 2018

From the docs and because we're passing it a vector, I'd expect multiplex to return transformed values in the order of delivery and transformation.

(into [] (x/multiplex [(map inc) (map str)]) (range 10))
=> [1 "0" 2 "1" 3 "2" 4 "3" 5 "4" 6 "5" 7 "6" 8 "7" 9 "8" 10 "9"]
(into [] (x/multiplex [(map inc) (map str)]) (range 10))
=> ["0" 1 "1" 2 "2" 3 "3" 4 "4" 5 "5" 6 "6" 7 "7" 8 "8" 9 "9" 10]

From my observation this only occurs with vector while return values with hash-maps are properly ordered:

(into [] (x/multiplex {:1 (map str) :2 (map identity)}) (range 10))
=> [[:1 "0"]
 [:2 0]
 [:1 "1"]
 [:2 1]
 [:1 "2"]
 [:2 2]
 [:1 "3"]
 [:2 3]
 [:1 "4"]
 [:2 4]
 [:1 "5"]
 [:2 5]
 [:1 "6"]
 [:2 6]
 [:1 "7"]
 [:2 7]
 [:1 "8"]
 [:2 8]
 [:1 "9"]
 [:2 9]]
@cgrand
Copy link
Owner

cgrand commented Aug 29, 2018

Do you have a use case where the deterministic behavior is needed?

@den1k
Copy link
Author

den1k commented Aug 29, 2018

Yes, I was implementing hull-moving-average which combines several weighted-moving-averages. To x/partition them I must know which is which.

Because vectors didn't work I ended up having to use maps which behave deterministically:

(defn factorial
  ([n]
   (assert (integer? n))
   (factorial 0 n))
  ([sum n]
   (if (zero? n)
     sum
     (recur (+ sum n) (dec n)))))

(defn wa
  ([] nil)
  ([^doubles acc]
   (when acc (/ (aget acc 1) (factorial (int (aget acc 0))))))
  ([^doubles acc x]
   (let [acc (or acc (double-array 2))] ; n, sum
     (let [n (inc (aget acc 0))]
       (doto acc
         (aset 0 n)
         (aset 1 (+ (aget acc 1) (* n x))))))))

(def wa-xf (x/reduce wa))

(defn wma-xf [period]
  (x/partition period 1 wa-xf))

(defn hma-xf
  "Hull Moving Average
  - https://github.com/thi-ng/umbrella/blob/master/packages/transducers-stats/src/hma.ts#L20
  - https://www.technicalindicators.net/indicators-technical-analysis/143-hma-hull-moving-average
  - https://www.fidelity.com/learning-center/trading-investing/technical-analysis/technical-indicator-guide/hull-moving-average"
  [period]
  {:pre [(int? period)]}
  (let
   [half-period (-> period (/ 2) int)
    psqrt       (int (Math/sqrt period))
    drop-p      (if (even? period) half-period (inc half-period))]
    (comp
     (x/multiplex {:half (comp
                          (drop drop-p)
                          (wma-xf half-period))
                   :all  (wma-xf period)})
     (x/partition 2)
     (map (fn [[[_ h] [_ a]]] (- (* 2 h) a))) ; <---------------- order important
     (wma-xf psqrt))))

Here's is a typescript based transducer version (the multiplex here acts like a deterministic (comp (multiplex [...n-fns]) (partition n))):
https://github.com/thi-ng/umbrella/blob/master/packages/transducers-stats/src/hma.ts#L20

@cgrand
Copy link
Owner

cgrand commented Aug 29, 2018

I see but even with maps I'm not at ease with:

(map (fn [[[_ h] [_ a]]] (- (* 2 h) a))) ; <---------------- order important

because it relies on the two transducers working in lock steps.

For one input item, a transducer may output any number of items, there's no guarantee on that so multiplex can't generally promise to interleave outputs properly. That's why the two modes are both non-deterministic but the map one tags its output so you can recognize them.

The deterministic order is an implementation detail of clojure small maps:

=> (into [] (x/multiplex (zipmap (range 8) (repeat (map inc)))) (range 2))
[[0 1] [1 1] [2 1] [3 1] [4 1] [5 1] [6 1] [7 1] [0 2] [1 2] [2 2] [3 2] [4 2] [5 2] [6 2] [7 2]]
=> (into [] (x/multiplex (zipmap (range 9) (repeat (map inc)))) (range 2))
[[0 1] [7 1] [1 1] [4 1] [6 1] [3 1] [2 1] [5 1] [8 1] [0 2] [7 2] [1 2] [4 2] [6 2] [3 2] [2 2] [5 2] [8 2]]

Your need is legit but I'm pretty sure that modifying multiplex is the right answer.

You want to take action when each sub transducer has output a value. It sounds a bit like a temporal join. Any idea is welcome

@den1k
Copy link
Author

den1k commented Aug 30, 2018

@cgrand FYI, this is off-topic but related: thi-ng/umbrella#37 (comment)

@aisamu
Copy link

aisamu commented Sep 4, 2018

It hardly qualifies as a use-case, but I also saw that behavior trying to implement Fizzbuzz.

(let [test (fn [div sss]
             (fn [num]
               (when (zero? (mod num div)) sss)))
      chain [(map (test 3 "Fizz")) (map (test 5 "Buzz"))]
      xform (comp (x/multiplex [(comp (x/multiplex chain)
                                      (x/partition (count chain) (x/reduce str))
                                      (map #(if (= "" %) nil %)))
                                identity])
                  (x/partition (count chain) (x/reduce rfs/some)))]
  (eduction xform (range 16)))

The result of running that repeatedly got me very confused:

(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
("FizzBuzz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "FizzBuzz")
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
("FizzBuzz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "FizzBuzz")
("BuzzFizz" 1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14 "BuzzFizz")
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)

It makes sense now: multiplex is non-deterministic, but apparently consistent throughout a whole "run". Given that, is this a completely misguided usage?
I tried following your directions from the other issue (#26 -replace it with transjuxt's), but had no luck fitting the required x/some in the above form

@cgrand
Copy link
Owner

cgrand commented Oct 2, 2019

I'm not willing to make the multiplex contract stronger at the moment.

@cgrand cgrand closed this as completed Oct 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants