.. _using:

=======
Roadmap
=======

Buckaroo is maturing and I decided to write a roadmap


Priorities
==========

Buckaroo is still for the most part, pre-users.  It is maturing though, and feedback gathered can map some reasonable principles

* Function as a reliable replacement for the default display of dataframes

  * Exceptions in the basic display of a dataframe are a P1 error.
  * Dataframes that don't display are a P1 error.
  * Taking more than a second to display a dataframe with less than 1M values is a P2 error

* Buckaroo should do the least surprising thing.

  * autocleaning should be turned off by default.

* Bug/feature request priorities

  * This is the roadmap and I'll stick with it.
  * If a user has a feature/bug request, and that is preventing them from using Buckaroo, that gets priority.


Release Plans
=============


0.4 Series
----------


#. Documentation

   * Readme refresh
   * How to create a formatter
   * Pluggable analysis framework refresh
   * Customizing autocleaning
   * Customizing enable/instantiation
   * Order of operations Dataflow doc
#. Promotion
#. Devops improvements (CI, testing, end to end testing, packaging)

   * CI passing - Done
   * CI testing - Done
   * End to End testing - Done
   * CI version Bump - needed
   * Ruff python linter - needed
#. Jupyter notebook compatability

   * Google colab - Done
   * VSCode - Done
   * Warning message on notebook < 7 - Done
   * Notebook 6.0 compatability ???
#. Code cleanup

   * Typescript passes linter - Done
   * snake_case camelCase normalization
   * better naming
   * sub module organization
#. Python Repr bugs

   * List
   * Tuple
   * Nested list and tuple across python types (int, float, boolean)
   * Dictionary?
#. Formatters

   * DateTime formatter
   * Float formatter with specificity
#. Frontend

   * Autoclean toggle


0.5 series
----------
I'm a bit fuzzy on this one, it's either going to be a backend port to polars or filtering.  I'll write it as filtering for now

#. Filtering

   * any field text search
   * Should work with codegen
   * Per column exact filtering
#. Additional sampling techniques

   * Chunks (50 contiguous rows)
   * Outliers - extent percentile for each colum all in a single view
   * Straight random sample
#. UI cycling

   * Everything that is now binary (summary stats on or off), is actually a single choice of multiple possible choices.  Allow multiple clicks to cycle through different options.
   * Enable cycling for summary_stats and sample method
#. Low code UI

   * Add Commands for filtering

0.6 series
----------
Polars backend

All of the same tests should pass.

#. Lowcode UI Commands in polars

   * Gives auto cleaning and filtering at much higher performance. Nice way to dip my feet into polars.
   * Testing that verifies ``eval(_to_py) == transform(df)`` and ``pl.transform(df) == pd.transform(df)``
   * pandas and polars equivalence is key to code gen continuing to be useful
#. Serialization in polars

   * 2x speed bump
   * straight forward
#. Pluggable analysis framework - for polars

   * Same pluggable analysis framework, now lazy
   * Summary stats run on whole dataframe - up to 1Gig

0.7 series
----------

#. serialization speedup

   * integrate parquest_wasm in the frontend
   * parquet serialization on the backend
   * maintain json serialization