Dataframe Helper Methods

Data and dataframe helper class for the module Core

Hint

Can be called directly from core as core.df.

Basic usage:

from ozcore import core
core.df.search(df, q="something")
class ozcore.core.data.dataframe.Dataframe

helper methods for dataframe operations

update_a_df_column(df_to_update: pandas.core.frame.DataFrame, df_as_source: pandas.core.frame.DataFrame, unique_col: str, col_to_update: str)

Updates a column of a DataFrame with values from a source DataFrame, matching records on their common unique column

Parameters
  • df_to_update – dataframe, main df to be updated

  • df_as_source – dataframe, source df to update the main df

  • unique_col – str, common column (must have the same name in both dataframes) used to match records; its values must be unique

  • col_to_update – str, column whose values are to be updated

Returns

a copy of the updated DataFrame

Warning

index is reset during the update
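
The behaviour described above can be approximated in plain pandas. The following is a minimal sketch, not ozcore's implementation: update_column_sketch is a hypothetical name, and keeping the old value when the source has no matching key is an assumption.

```python
import pandas as pd


def update_column_sketch(df_to_update, df_as_source, unique_col, col_to_update):
    """Hypothetical stand-in for update_a_df_column, written in plain pandas."""
    # Build a key -> new value lookup; the keys must be unique for map() to work.
    new_values = df_as_source.set_index(unique_col)[col_to_update]
    # Work on a fresh frame with a reset index, mirroring the documented warning.
    result = df_to_update.reset_index(drop=True)
    mapped = result[unique_col].map(new_values)
    # Assumption: rows without a match in the source keep their old value.
    # A missing value in the source is also treated as "no update".
    result[col_to_update] = mapped.where(mapped.notna(), result[col_to_update])
    return result


main = pd.DataFrame({"id": [1, 2, 3], "price": [10.0, 20.0, 30.0]}, index=[7, 8, 9])
source = pd.DataFrame({"id": [2, 3], "price": [25.0, 35.0]})
updated = update_column_sketch(main, source, "id", "price")
print(updated)
```

The original frame is left untouched, and the returned copy carries a fresh 0..n-1 index.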

pngTable(df: pandas.core.frame.DataFrame, colwidth_factor: float = 0.2, fontsize: int = 12, formatFloats: bool = True, save: bool = False, in_folder: Optional[pathlib.PosixPath] = None)

Displays or saves a table as a png image. Uses the pandas plotting table, which draws with matplotlib.

Parameters
  • df – dataframe or pivot table

  • colwidth_factor – float, default 0.20, defines the width of columns

  • fontsize – int, default 12

  • formatFloats – bool, default True, formats floats as pretty two-digit floats

  • save – bool, default False, saves the png file as table.png

  • in_folder – pathlib.PosixPath, default None, folder in which to save the png file

Returns

png file in Downloads folder

compare_two_df(df_1, df_2, col_to_compare: str, side='both')

Compares two dataframes on a given column, i.e. a Series common to both

Warning

This comparison only checks for identical values in the given Series. Other columns may still differ.

Parameters
  • df_1 – dataframe 1

  • df_2 – dataframe 2

  • col_to_compare – str, column common to both dataframes, used for the comparison

  • side – str, default both, options: both, left, right

Returns

  • a dataframe with the differences of df_1 from df_2

  • empty if all match
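
A column-based comparison of this kind can be sketched in plain pandas. This is an illustration under stated assumptions, not ozcore's implementation: compare_sketch is a hypothetical name, and the reading of side (left = values only in df_1, right = values only in df_2, both = the two stacked) is a guess.

```python
import pandas as pd


def compare_sketch(df_1, df_2, col_to_compare, side="both"):
    """Hypothetical stand-in for compare_two_df, written in plain pandas."""
    # Rows whose value in the compared column is missing from the other frame.
    only_left = df_1[~df_1[col_to_compare].isin(df_2[col_to_compare])]
    only_right = df_2[~df_2[col_to_compare].isin(df_1[col_to_compare])]
    # Assumption: this is what the `side` option selects.
    if side == "left":
        return only_left
    if side == "right":
        return only_right
    return pd.concat([only_left, only_right])


a = pd.DataFrame({"code": ["a", "b", "c"], "n": [1, 2, 3]})
b = pd.DataFrame({"code": ["b", "c", "d"], "n": [2, 99, 4]})
diff = compare_sketch(a, b, "code")
print(diff)
```

The row with code "c" is not reported although its n values differ (3 versus 99), which is the situation the warning above describes.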

add_a_col_from_a_df(into_df: pandas.core.frame.DataFrame, from_df: pandas.core.frame.DataFrame, unique_col: str, col_to_add: str)

Adds a column to a dataframe from another dataframe

Parameters
  • into_df – dataframe, main df, which will be updated with a new column

  • from_df – dataframe, source df, which has the column to add into main df

  • unique_col – str, column name which is common in both dataframes

  • col_to_add – str, column to be added from source dataframe

Returns

  • main dataframe filled with the new column and values, where unique column matches

Warning

this method assumes no index
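
Adding a column from another dataframe by a shared key is commonly done with a left merge. The sketch below shows that approach; it is not ozcore's implementation, and add_column_sketch is a hypothetical name.

```python
import pandas as pd


def add_column_sketch(into_df, from_df, unique_col, col_to_add):
    """Hypothetical stand-in for add_a_col_from_a_df, written in plain pandas."""
    # Take only the key and the wanted column from the source frame.
    lookup = from_df[[unique_col, col_to_add]]
    # A left merge keeps every row of the main frame; unmatched rows get NaN.
    # Merging on a column returns a new 0..n-1 index, so any custom index is lost.
    return into_df.merge(lookup, on=unique_col, how="left")


people = pd.DataFrame({"id": [1, 2, 3], "name": ["Ada", "Bo", "Cy"]}, index=[10, 11, 12])
cities = pd.DataFrame({"id": [1, 3], "city": ["Izmir", "Ankara"], "zip": ["35", "06"]})
result = add_column_sketch(people, cities, "id", "city")
print(result)
```

In this sketch the custom index of the main dataframe is discarded by the merge, so frames with a meaningful index should have it moved into a column first.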

search(df_to_search, q, columns=None)

Searches all or selected columns of a dataframe. Only columns holding str values (type: object) are searched

Parameters
  • df_to_search – dataframe to be searched

  • q – str, query term

  • columns – str | list, default None, columns to search; if None, all columns are searched

Returns

a dataframe with found records

Note

index columns are not included.
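
A text search over string columns can be sketched in plain pandas as follows. This is not ozcore's implementation: search_sketch is a hypothetical name, and case-insensitive substring matching is an assumption about how q is interpreted.

```python
import pandas as pd


def search_sketch(df_to_search, q, columns=None):
    """Hypothetical stand-in for search, written in plain pandas."""
    if columns is None:
        # Default: every column that holds strings; numeric columns are skipped.
        columns = [c for c in df_to_search.columns
                   if pd.api.types.is_string_dtype(df_to_search[c])]
    elif isinstance(columns, str):
        columns = [columns]
    # Start with "no row matches" and OR in the hits of each column.
    mask = pd.Series(False, index=df_to_search.index)
    for col in columns:
        # Assumption: plain substring match, ignoring case.
        hits = df_to_search[col].astype(str).str.contains(q, case=False, regex=False)
        mask = mask | hits
    return df_to_search[mask]


df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "city": ["Oslo", "Boston", "Cairo"],
    "age": [30, 40, 50],
})
found = search_sketch(df, "bo")
print(found)
```

Only column values are inspected here, so values stored in the index are never matched, in line with the note above.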