Skip to content
This repository was archived by the owner on May 21, 2022. It is now read-only.

Propose some changes to DataFrames.jl API #68

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"

[compat]
DataFrames = "0.19, 0.20, 0.21, 0.22, 1"
DataFrames = "0.22, 1"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.22 introduces only

LearnBase = "0.2, 0.3"
MLDataPattern = "0.5"
MLLabelUtils = "0.4, 0.5"
Expand Down
9 changes: 6 additions & 3 deletions src/datapattern.jl
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ LearnBase.getobs(dt::AbstractDataFrame, idx) = dt[idx,:]

LearnBase.nobs(dt::DataFrameRow) = 1 # it is a observation
function LearnBase.getobs(dt::DataFrameRow, idx)
idx == 1:1 || throw(ArgumentError(
only(idx) == 1 || throw(ArgumentError(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that idx can be any collection as long as it has a single element equal to 1. Right?

"Attempting to read multiple rows ($idx) with a single row"))

return dt
Expand All @@ -23,9 +23,12 @@ LearnBase.gettarget(::typeof(identity), dt::DataFrameRow) =
_throw_table_error()

# convenience syntax to allow column name
LearnBase.gettarget(col::Symbol, dt::AbstractDataFrame) = dt[1, col]
LearnBase.gettarget(col::Symbol, dt::AbstractDataFrame) =
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that for this to work we should make sure that the data frame has only one row. Right?

Also the question is why do we only allow Symbol for col. Also integer or string index is allowed in DataFrames.jl.

LearnBase.gettarget(col, only(dt))
LearnBase.gettarget(col::Symbol, dt::DataFrameRow) = dt[col]
LearnBase.gettarget(fun, dt::AbstractDataFrame) = fun(dt)
LearnBase.gettarget(fun, dt::AbstractDataFrame) =
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the intention correctly the fun form should also normally take a DataFrameRow as an argument. Right? This change is breaking (and I might have misunderstood the intention here).

LearnBase.gettarget(col, only(dt))
LearnBase.gettarget(fun, dt::DataFrameRow) = fun(dt)

# avoid copy when target extraction function is supplied
MLDataPattern.getobs_targetfun(dt::AbstractDataFrame) = dt
Expand Down