Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast build ellipse model #1288

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

HuanLe02
Copy link

Hello! This is a pull request in response to #1168 for a faster build_ellipse_model() function, using C and OpenMP. Changes are

  • model.py:
    (1) changed data structure for result and weight from 2D numpy array to a 1D C-native array
    (2) converted all auxiliary numpy arrays (intens_array, eps_array, etc) into C arrays
    (3) ported the for loop into C code
  • tests/test_model.py:
    (1) Add new test for a multithreaded version of the function
    (2) Add new test for processing a large array (5000x5000), shows that result is returned very fast
  • worker.so: compiled file from C code

Here is the C code for the loop (worker.c):

#include <omp.h>
#include <math.h>
#include <stdlib.h>

#define PI 3.1415926535897932384626433

// MAX FUNCTION
static double max(double a, double b)
{
  if (a > b) return a;
  else return b;
}

// radius function, obtained from code of EllipseGeomtry
// https://photutils.readthedocs.io/en/stable/_modules/photutils/isophote/geometry.html#EllipseGeometry
static double radius(double angle, double sma, double eps)
{
  double val = (sma * (1. - eps) /
                sqrt(pow(((1. - eps) * cos(angle)),2) +
                        pow((sin(angle)),2)));
  return val;
}

// WORKER FUNCTION
void worker(double* result, double* weight, int n_rows, int n_cols, int N, int high_harmonics,
		double* fss_arr, double* intens_arr, double* eps_arr, double* pa_arr,
		double* x0_arr, double* y0_arr, double* a3_arr, double* b3_arr, double* a4_arr, double* b4_arr,
		int nthreads)
{
  int i;
  int n_entries = n_rows * n_cols;

  omp_set_num_threads(nthreads);

  #pragma omp parallel default(shared) private(i)
  {
    // private arrays for each thread to write, calloc auto initializes to 0.0
    double* result_priv = calloc(n_entries, sizeof(double));
    double* weight_priv = calloc(n_entries, sizeof(double));

    #pragma omp for schedule(static) nowait
    for (i = 1; i < N; i++) {
      // variables, obtained at array[index]
      double sma0 = fss_arr[i];
      double intens = intens_arr[i];
      double eps = eps_arr[i];
      double pa = pa_arr[i];
      double x0 = x0_arr[i];
      double y0 = y0_arr[i];

      // Ellipse geometry, _phi_min (defined in original code as 0.05)
      // https://photutils.readthedocs.io/en/stable/_modules/photutils/isophote/geometry.html#EllipseGeometry
      double _phi_min = 0.05;

      // scan angle
      double r = sma0;
      double phi = 0.0;
      while (phi <= 2*PI + _phi_min) {
        double harm = 0.0;
        if (high_harmonics) {
          harm = (a3_arr[i] * sin(3.0*phi) +
          	b3_arr[i] * cos(3.0*phi) +
          	a4_arr[i] * sin(4.0*phi) +
          	b4_arr[i] * cos(4.0*phi)) / 4.0;
        }
        double x = r * cos(phi + pa) + x0;
        double y = r * sin(phi + pa) + y0;
        int i = (int) x;
        int j = (int) y;
        double fx, fy;

        if (i > 0 && i < n_cols-1 && j > 0 && j < n_rows-1) {
          // fractional deviations
          fx = x - i;
          fy = y - j;

          // transform 2D index to 1D index on a flattened array
          // ex: j_i = [j,i], j1_i = [j+1,i]
          int j_i = i + j*n_cols;
          int j_i1 = (i+1) + j*n_cols;
          int j1_i = i + (j+1)*n_cols;
          int j1_i1 = (i+1) + (j+1)*n_cols;

          // isophote contribution
          result_priv[j_i] += (intens + harm) * (1.0 - fy) * (1.0 - fx);
          result_priv[j_i1] += (intens + harm) * (1.0 - fy) * fx;
          result_priv[j1_i] += (intens + harm) * fy * (1.0 - fx);
          result_priv[j1_i1] += (intens + harm) * fy * fx;

          // fractional area contribution
          weight_priv[j_i] += (1.0 - fy) * (1.0 - fx);
          weight_priv[j_i1] += (1.0 - fy) * fx;
          weight_priv[j1_i] += fy * (1.0 - fx);
          weight_priv[j1_i1] += fy * fx;

          // step next pixel on ellipse
          phi = max((phi + 0.75/r), _phi_min);
          r = max(radius(phi, sma0, eps), 0.5);
        }
        else {
          break;
        }
      }
    }

    // write to main array, one thread at a time
    #pragma omp critical
    {
      for (int ct = 0; ct < n_entries; ct++) {
        result[ct] += result_priv[ct];
	weight[ct] += weight_priv[ct];
      }
    }

    // free memory
    free(result_priv);
    free(weight_priv);
  }  
}

@pep8speaks
Copy link

Hello @huancho9802! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 35:5: W191 indentation contains tabs
Line 35:5: E101 indentation contains mixed spaces and tabs
Line 91:1: W293 blank line contains whitespace
Line 109:1: W293 blank line contains whitespace
Line 124:1: W293 blank line contains whitespace
Line 125:6: E114 indentation is not a multiple of four (comment)
Line 125:6: E116 unexpected indentation (comment)
Line 136:1: W293 blank line contains whitespace
Line 137:18: W291 trailing whitespace

Line 47:1: W293 blank line contains whitespace
Line 48:1: E302 expected 2 blank lines, found 1
Line 65:1: W293 blank line contains whitespace
Line 94:1: E302 expected 2 blank lines, found 1
Line 109:1: W293 blank line contains whitespace

@HuanLe02
Copy link
Author

HuanLe02 commented Feb 1, 2022

It seems like the ReadTheDocs build failed because of my worker.so file (OSError: shared object file not found). Can you help me with packaging the .so file? This is my first time doing this so I really appreciate any help. Thanks!

@larrybradley
Copy link
Member

Thanks, @huancho9802. @ibusko, can you help with this PR?

@ibusko
Copy link
Contributor

ibusko commented Feb 7, 2022

@larrybradley Unfortunately, I don't have much experience with CI, especially with packaging C libraries for ReadTheDocs. It would take me a while to get into that, and I am currently booked solid with JETC work. Next week I go back to work with the Indigo team, maybe we can carve out some time them?

@perwin
Copy link

perwin commented Feb 8, 2022

It seems like the ReadTheDocs build failed because of my worker.so file (OSError: shared object file not found). Can you help me with packaging the .so file? This is my first time doing this so I really appreciate any help. Thanks!

My memory from a couple of years ago dealing with a (possibly) similar issue is that ReadTheDocs simply won't install Python extension modules (i.e., modules based on compiled C/C++ code).

The problem I had was that I had Sphinx generating API-based HTML documentation using Sphinx autodoc commands; this would fail on ReadTheDocs for any module that tried to import the extension module, because the latter wasn't present in the ReadTheDocs installation.

(I eventually figured out a kludge where I would generate documentation normally on my laptop, and then repackage the API-based HTML output into "raw HTML" that could be used by Sphinx on ReadTheDocs in place of the autodoc-related commands.)

@HuanLe02
Copy link
Author

Hello all! Thank you so much for taking a look at my PR. I'm aware that the auto build tools fail to build C code and recognize the binary *.so file. I notice that a lot of the C-extensions in the package are written in Cython, so should I migrate my function to Cython instead of pure C?

If yes then I will definitely take a look at Cython in the future, though I'm in school right now so it may take a while. For the mean time, you can take a look at the demo at https://github.com/huancho9802/fast-build-ellipse-demo and to see the speedup compared to pure Python code.

Thanks!

isolist : `~photutils.isophote.IsophoteList` instance
The isophote list created by the `~photutils.isophote.Ellipse`
class.

nthreads: float, optiomal

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
nthreads: float, optiomal
nthreads: float, optional

isolist : `~photutils.isophote.IsophoteList` instance
The isophote list created by the `~photutils.isophote.Ellipse`
class.

nthreads: float, optiomal
Number of threads to perform work. Default is 1 (serial code).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Number of threads to perform work. Default is 1 (serial code).
Number of threads to execute the task. Default is 1 (serial code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants