Compute histogram bin counts for specified variables in baseline and target data for drift detection (2024)


Since R2022a


    Syntax

    H = histcounts(DDiagnostics)

    H = histcounts(DDiagnostics,Variables=variables)

    Description


    H = histcounts(DDiagnostics) returns the histogram bin counts in the table H for all variables specified for drift detection in the call to the detectdrift function.


    H = histcounts(DDiagnostics,Variables=variables) returns the bin counts for the variables specified by variables.

    Examples


    Compute Histogram Bin Counts for All Variables


    Generate baseline and target data with two variables, where the distribution parameters of the second variable change for target data.

    rng('default') % For reproducibility
    baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1)];
    target = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1)];

    Perform permutation testing for any drift between the baseline and target data.

    DDiagnostics = detectdrift(baseline,target);

    Compute the histogram bin counts for all variables.

    H = histcounts(DDiagnostics)
    H=2×3 table
          Bins    Counts_Baseline    Counts_Target
    x1    {[-3.5000 -3 -2.5000 -2 -1.5000 -1 -0.5000 0 0.5000 1 1.5000 2 2.5000 3 3.5000 4]}    {[0 1 1 3 14.0000 11 17 17 15 11 5 1 2 1 1]}    {[1 0 2 6 7.0000 13 22 24 11 8 4 2 0 0 0]}
    x2    {[0 0.5000 1 1.5000 2 2.5000 3 3.5000 4 4.5000 5 5.5000 6]}    {[33 23 14.0000 11 8 6 3 0 0 1 0 1]}    {[13 32 29.0000 20 6 0 0 0 0 0 0 0]}

    H is a table with three columns. histcounts divides the data into bins and computes the histogram bin counts for a variable in the baseline and target data over the common bins. The first and second rows contain the bins and counts for variables x1 and x2, respectively.

    Access the histogram bin counts in the baseline data for the first variable.

    H.Counts_Baseline{1}
    ans = 1×15

         0    1.0000    1.0000    3.0000   14.0000   11.0000   17.0000   17.0000   15.0000   11.0000    5.0000    1.0000    2.0000    1.0000    1.0000
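As a quick sanity check on how bin edges relate to counts, the x1 values printed above can be reused directly. The sketch below hard-codes those values rather than recomputing them, so it needs no toolbox; the names edges, counts, numBins, total, and proportions are illustrative only.

```matlab
% Values copied from the x1 row of the example output above
edges  = -3.5:0.5:4;                                  % 16 bin edges
counts = [0 1 1 3 14 11 17 17 15 11 5 1 2 1 1];       % baseline counts

% Each pair of adjacent edges defines one bin
numBins = numel(edges) - 1;
assert(numel(counts) == numBins)

% Total number of baseline observations and fraction per bin
total = sum(counts);
proportions = counts / total;
```

Here total matches the 100 baseline observations generated at the start of the example, and proportions holds the fraction of observations that falls in each bin.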

    Plot the probability density function (pdf) estimate (percent of the data in each bin) of the baseline data for variable 1.

    histogram(BinEdges=H.Bins{1},BinCounts=H.Counts_Baseline{1},Normalization='probability')


    You can also plot the histogram of the baseline and target data for variable 1 using the plotHistogram function.

    plotHistogram(DDiagnostics,Variable=1)


    Compute Histogram Bin Counts for Specific Variables


    Load the sample data.

    load humanactivity

    For details on the data set, enter Description at the command line.

    Assign the first 1000 observations as baseline data and the next 1000 as target data.

    baseline = feat(1:1000,:);
    target = feat(1001:2000,:);

    Test for drift on all variables.

    DDiagnostics = detectdrift(baseline,target);

    Compute the histogram bin counts for only the first five variables.

    H = histcounts(DDiagnostics,Variables=(1:5))
    H=5×3 table
          Bins    Counts_Baseline    Counts_Target
    x1    {[-0.2000 -0.1000 0 0.1000 0.2000 0.3000 0.4000 0.5000 0.6000 0.7000 0.8000 0.9000]}    {[0 0 0 0 0 0 0 0 0 85.9000 14.1000]}    {[12.4000 76.6000 2.1000 0 0 0 0.1000 0.1000 0.1000 0.1000 8.5000]}
    x2    {[-0.3000 -0.2000 -0.1000 0 0.1000 0.2000 0.3000 0.4000 0.5000 0.6000 0.7000 0.8000 0.9000 1 1.1000]}    {[0 0 0 0 0 9.9000 24 0.3000 65.8000 0 0 0 0 0]}    {[0.1000 0 0.1000 0.1000 0.1000 8.2000 0.3000 0 0 0 0 0 53.8000 37.3000]}
    x3    {[-0.6000 -0.5500 -0.5000 -0.4500 -0.4000 -0.3500 -0.3000 -0.2500 -0.2000 -0.1500 -0.1000 -0.0500 0 0.0500 0.1000 0.1500 0.2000 0.2500]}    {[0 19.9000 13.6000 0.3000 0.3000 0.2000 65.7000 0 0 0 0 0 0 0 0 0 0]}    {[0.1000 0.4000 8.4000 0 0 0 0 0 12.9000 4.1000 0.3000 0.2000 0.4000 8.5000 49.1000 2.7000 12.9000]}
    x4    {[0 0.0100 0.0200 0.0300 0.0400 0.0500 0.0600 0.0700 0.0800 0.0900 0.1000 0.1100 0.1200 0.1300 0.1400 0.1500 0.1600 0.1700 0.1800 0.1900 0.2000]}    {[0 0 0 0 0 0 0 0 0 0 65.6000 33.9000 0.4000 0.1000 0 0 0 0 0 0]}    {[34.5000 55.7000 0.9000 0 0 0 0 0 0 0 0 7.4000 0.5000 0.2000 0.3000 0 0.1000 0.3000 0 0.1000]}
    x5    {[0.0300 0.0400 0.0500 0.0600 0.0700 0.0800 0.0900 0.1000 0.1100 0.1200 0.1300 0.1400 0.1500 0.1600 0.1700]}    {[0.3000 33.1000 0 0 0.3000 66 0.3000 0 0 0 0 0 0 0]}    {[0 7.5000 0.5000 0.1000 0 0 0 0.1000 0.1000 0 0.2000 91.1000 0.2000 0.2000]}

    Access the histogram bin counts for the second variable in the target data.

    H.Counts_Target{2}
    ans = 1×14

    0.1000         0    0.1000    0.1000    0.1000    8.2000    0.3000         0         0         0         0         0   53.8000   37.3000
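One possible way to read these numbers is to look for the bin where the baseline and target values differ most. The sketch below hard-codes the x2 values from the output above instead of calling histcounts, so it runs without the data set. The names edges, diffPerBin, and idx are illustrative, and the comparison itself is only an informal inspection, not part of the drift test.

```matlab
% Values copied from the x2 row of the output above
edges    = -0.3:0.1:1.1;   % 15 bin edges, 14 bins
baseline = [0 0 0 0 0 9.9 24 0.3 65.8 0 0 0 0 0];
target   = [0.1 0 0.1 0.1 0.1 8.2 0.3 0 0 0 0 0 53.8 37.3];

% Absolute difference between target and baseline in each bin
diffPerBin = abs(target - baseline);

% Index of the bin with the largest difference, and its two edges
[~,idx] = max(diffPerBin);
fprintf('Largest difference in bin %d, between %.1f and %.1f\n', ...
    idx,edges(idx),edges(idx+1))
```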

    Input Arguments


    DDiagnostics - Diagnostics of permutation testing for drift detection
    DriftDiagnostics object

    Diagnostics of the permutation testing for drift detection, specified as a DriftDiagnostics object returned by detectdrift.

    variables - List of variables
    string array | cell array of character vectors | integer indices

    List of variables for which to compute the histogram bin counts, specified as a string array, cell array of character vectors, or list of integer indices.

    Example: Variables=["x1","x3"]

    Example: Variables=[1,3]

    Data Types: single | double | char | string
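The name and index forms of Variables can be compared side by side. This is a minimal sketch that assumes Statistics and Machine Learning Toolbox is available and that matrix inputs receive the default variable names x1 and x2, as in the first example on this page; Hname and Hidx are illustrative names.

```matlab
% Rebuild the two-variable data from the first example
rng('default') % For reproducibility
baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1)];
target   = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1)];
DDiagnostics = detectdrift(baseline,target);

% Select the second variable by name and by index
Hname = histcounts(DDiagnostics,Variables="x2");
Hidx  = histcounts(DDiagnostics,Variables=2);

% Both calls are expected to describe the same variable
sameResult = isequal(Hname,Hidx);
```

If the assumption about default names holds, both tables have a single row and sameResult is true.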

    Output Arguments


    H — Histogram bin counts
    table

    Histogram bin counts, returned as a table with the following columns.

    Column Name        Description
    Bins               Common domain over which to evaluate the histogram bin counts for a variable.
                       • For categorical variables, Bins contains the categories.
                       • For continuous variables, Bins contains the bin edges.
    Counts_Baseline    Histogram bin counts for the corresponding variables in the baseline data
    Counts_Target      Histogram bin counts for the corresponding variables in the target data

    For each variable in H, the columns contain the bins and counts in cell arrays. To access the counts, you can index into the table; for example, to obtain the histogram bin counts for the second variable in the baseline data, use H.Counts_Baseline{2,1}.
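The cell indexing described above can be tried on a small stand-in table. The sketch below builds a hypothetical table by hand with the same three column names; its values are made up for illustration and are not output from histcounts.

```matlab
% Hypothetical stand-in for H with the same column layout
Bins            = {[0 1 2 3]; [0 0.5 1]};
Counts_Baseline = {[5 3 2];   [6 4]};
Counts_Target   = {[2 3 5];   [4 6]};
H = table(Bins,Counts_Baseline,Counts_Target,RowNames=["x1","x2"]);

% Curly braces extract the numeric vector stored in a cell
c2 = H.Counts_Baseline{2,1};

% cellfun applies a function to every variable's counts at once
totals = cellfun(@sum,H.Counts_Baseline);
```

Because each column is a cell array with one row per variable, H.Counts_Baseline{2} returns the same vector as H.Counts_Baseline{2,1}.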

    Algorithms

    • For categorical data, detectdrift adds a 0.5 correction factor to the histogram bin count of each bin to handle empty bins (categories). This is equivalent to assuming that the parameter p, the probability that a value of the variable falls in that category, has the prior distribution Beta(0.5,0.5), which is the Jeffreys prior for this parameter.

    • histcounts treats a variable as ordinal for visualization purposes in these cases:

      • The variable is ordinal in either the baseline data or the target data, and the categories from both the baseline data and the target data are the same.

      • The variable is ordinal in either the baseline data or the target data, and the categories of the other data set are a subset of the categories in the ordinal data.

      • The variable is ordinal in both the baseline data and the target data, and categories from either data set are a subset of the other.

    • If a variable is ordinal, histcounts preserves the order of the bin names.
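The arithmetic of the 0.5 correction for categorical data, described in the first item above, can be sketched on made-up counts. This only illustrates the effect of the correction and is not the toolbox's internal code; rawCounts is hypothetical, and normalizing the corrected counts into proportions is an added illustrative step.

```matlab
% Hypothetical raw category counts; the third category is empty
rawCounts = [12 8 0];

% Add the 0.5 correction factor to every category
corrected = rawCounts + 0.5;

% Illustrative step: proportions based on the corrected counts
p = corrected / sum(corrected);
```

After the correction, the empty category has a small nonzero proportion, which avoids zero-valued bins when the two data sets are compared.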

    Version History

    Introduced in R2022a

    See Also

    detectdrift | DriftDiagnostics | plotDriftStatus | plotEmpiricalCDF | plotHistogram | plotPermutationResults | ecdf | summary


    Article information

    Author: Dan Stracke
