Introduce adjacentPairs
#119
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# AdjacentPairs | ||
|
||
[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/AdjacentPairs.swift) | | ||
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/AdjacentPairsTests.swift)] | ||
|
||
Lazily iterates over tuples of adjacent elements. | ||
|
||
This operation is available for any sequence by calling the `adjacentPairs()` method. | ||
|
||
```swift | ||
let numbers = (1...5) | ||
let pairs = numbers.adjacentPairs() | ||
// Array(pairs) == [(1, 2), (2, 3), (3, 4), (4, 5)] | ||
``` | ||
|
||
## Detailed Design | ||
|
||
The `adjacentPairs()` method is declared as a `Sequence` extension returning `AdjacentPairsSequence` and as a `Collection` extension returning `AdjacentPairsCollection`. | ||
|
||
```swift | ||
extension Sequence { | ||
public func adjacentPairs() -> AdjacentPairsSequence<Self> | ||
} | ||
``` | ||
|
||
```swift | ||
extension Collection { | ||
public func adjacentPairs() -> AdjacentPairsCollection<Self> | ||
} | ||
``` | ||
|
||
The `AdjacentPairsSequence` type is a sequence, and the `AdjacentPairsCollection` type is a collection with conditional conformance to `BidirectionalCollection` and `RandomAccessCollection` when the underlying collection conforms. | ||
|
||
### Complexity | ||
|
||
Calling `adjacentPairs` is an O(1) operation. | ||
|
||
### Naming | ||
|
||
This method is named for clarity while remaining agnostic to any particular domain of programming. In natural language processing, this operation is akin to computing a list of bigrams; however, this algorithm is not specific to this use case. | ||
|
||
[naming]: https://forums.swift.org/t/naming-of-chained-with/40999/ | ||
|
||
### Comparison with other languages | ||
|
||
This function is often written as a `zip` of a sequence together with itself, minus its first element. | ||
|
||
**Haskell:** This operation is spelled ``s `zip` tail s``. | ||
|
||
**Python:** Python users may write `zip(s, s[1:])` for a list with at least one element. For natural language processing, the `nltk` package offers a `bigrams` function akin to this method. | ||
|
||
Note that in Swift, the spelling `zip(s, s.dropFirst())` is undefined behavior for a single-pass sequence `s`. |
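
To illustrate the single-pass concern above, the following is a minimal sketch of how adjacent pairs can be produced with exactly one pass over a sequence. The names `eagerAdjacentPairs` and `makeSinglePass` are hypothetical helpers written for this illustration and are not part of the library; unlike `adjacentPairs()`, this sketch is eager rather than lazy.

```swift
// Hypothetical helper (not part of swift-algorithms): collects adjacent pairs
// eagerly while consuming the sequence exactly once.
func eagerAdjacentPairs<S: Sequence>(_ s: S) -> [(S.Element, S.Element)] {
    var iterator = s.makeIterator()
    guard var previous = iterator.next() else { return [] }
    var result: [(S.Element, S.Element)] = []
    while let current = iterator.next() {
        result.append((previous, current))
        previous = current
    }
    return result
}

// Hypothetical single-pass source: `AnyIterator` is itself a `Sequence`,
// and every traversal draws from the same captured counter.
func makeSinglePass() -> AnyIterator<Int> {
    var n = 0
    return AnyIterator {
        n += 1
        return n <= 5 ? n : nil
    }
}

let pairs = eagerAdjacentPairs(makeSinglePass())
for pair in pairs {
    print(pair)
}
```

Because the helper keeps only the previous element and a single iterator, it never needs to traverse the source a second time, which is the property that `zip(s, s.dropFirst())` cannot guarantee for a single-pass `s`.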
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,271 @@ | ||
//===----------------------------------------------------------------------===// | ||
// | ||
// This source file is part of the Swift Algorithms open source project | ||
// | ||
// Copyright (c) 2021 Apple Inc. and the Swift project authors | ||
// Licensed under Apache License v2.0 with Runtime Library Exception | ||
// | ||
// See https://swift.org/LICENSE.txt for license information | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
extension Sequence { | ||
/// Creates a sequence of adjacent pairs of elements from this sequence. | ||
/// | ||
/// In the `AdjacentPairsSequence` returned by this method, the elements of | ||
/// the *i*th pair are the *i*th and *(i+1)*th elements of the underlying | ||
/// sequence. | ||
/// The following example uses the `adjacentPairs()` method to iterate over | ||
/// adjacent pairs of integers: | ||
/// | ||
/// for pair in (1...5).adjacentPairs() { | ||
/// print(pair) | ||
/// } | ||
/// // Prints "(1, 2)" | ||
/// // Prints "(2, 3)" | ||
/// // Prints "(3, 4)" | ||
/// // Prints "(4, 5)" | ||
@inlinable | ||
public func adjacentPairs() -> AdjacentPairsSequence<Self> { | ||
AdjacentPairsSequence(base: self) | ||
} | ||
} | ||
|
||
extension Collection { | ||
/// Creates a collection of adjacent pairs of elements from this collection. | ||
/// | ||
/// In the `AdjacentPairsCollection` returned by this method, the elements of | ||
/// the *i*th pair are the *i*th and *(i+1)*th elements of the underlying | ||
/// collection. The following example uses the `adjacentPairs()` method to | ||
/// iterate over adjacent pairs of integers: | ||
/// ``` | ||
/// for pair in (1...5).adjacentPairs() { | ||
/// print(pair) | ||
/// } | ||
/// // Prints "(1, 2)" | ||
/// // Prints "(2, 3)" | ||
/// // Prints "(3, 4)" | ||
/// // Prints "(4, 5)" | ||
/// ``` | ||
@inlinable | ||
public func adjacentPairs() -> AdjacentPairsCollection<Self> { | ||
AdjacentPairsCollection(base: self) | ||
} | ||
} | ||
|
||
/// A sequence of adjacent pairs of elements built from an underlying sequence. | ||
/// | ||
/// In an `AdjacentPairsSequence`, the elements of the *i*th pair are the *i*th | ||
/// and *(i+1)*th elements of the underlying sequence. The following example | ||
/// uses the `adjacentPairs()` method to iterate over adjacent pairs of | ||
/// integers: | ||
/// ``` | ||
/// for pair in (1...5).adjacentPairs() { | ||
/// print(pair) | ||
/// } | ||
/// // Prints "(1, 2)" | ||
/// // Prints "(2, 3)" | ||
/// // Prints "(3, 4)" | ||
/// // Prints "(4, 5)" | ||
/// ``` | ||
public struct AdjacentPairsSequence<Base: Sequence> { | ||
@usableFromInline | ||
internal let base: Base | ||
|
||
/// Creates an instance that makes pairs of adjacent elements from `base`. | ||
@inlinable | ||
internal init(base: Base) { | ||
self.base = base | ||
} | ||
} | ||
|
||
extension AdjacentPairsSequence { | ||
public struct Iterator { | ||
@usableFromInline | ||
internal var base: Base.Iterator | ||
|
||
@usableFromInline | ||
internal var previousElement: Base.Element? | ||
|
||
@inlinable | ||
internal init(base: Base.Iterator) { | ||
self.base = base | ||
} | ||
} | ||
} | ||
|
||
extension AdjacentPairsSequence.Iterator: IteratorProtocol { | ||
public typealias Element = (Base.Element, Base.Element) | ||
|
||
@inlinable | ||
public mutating func next() -> Element? { | ||
if previousElement == nil { | ||
previousElement = base.next() | ||
} | ||
|
||
guard let previous = previousElement, let next = base.next() else { | ||
return nil | ||
} | ||
|
||
previousElement = next | ||
return (previous, next) | ||
} | ||
} | ||
|
||
extension AdjacentPairsSequence: Sequence { | ||
@inlinable | ||
public func makeIterator() -> Iterator { | ||
Iterator(base: base.makeIterator()) | ||
} | ||
|
||
@inlinable | ||
public var underestimatedCount: Int { | ||
Swift.max(0, base.underestimatedCount - 1) | ||
} | ||
} | ||
|
||
/// A collection of adjacent pairs of elements built from an underlying collection. | ||
/// | ||
/// In an `AdjacentPairsCollection`, the elements of the *i*th pair are the *i*th | ||
/// and *(i+1)*th elements of the underlying collection. The following example | ||
/// uses the `adjacentPairs()` method to iterate over adjacent pairs of | ||
/// integers: | ||
/// ``` | ||
/// for pair in (1...5).adjacentPairs() { | ||
/// print(pair) | ||
/// } | ||
/// // Prints "(1, 2)" | ||
/// // Prints "(2, 3)" | ||
/// // Prints "(3, 4)" | ||
/// // Prints "(4, 5)" | ||
/// ``` | ||
public struct AdjacentPairsCollection<Base: Collection> { | ||
@usableFromInline | ||
internal let base: Base | ||
|
||
public let startIndex: Index | ||
|
||
@inlinable | ||
internal init(base: Base) { | ||
self.base = base | ||
|
||
// Precompute `startIndex` to ensure O(1) behavior, | ||
// avoiding indexing past `endIndex` | ||
let start = base.startIndex | ||
let end = base.endIndex | ||
let second = start == end ? start : base.index(after: start) | ||
self.startIndex = Index(first: start, second: second) | ||
} | ||
} | ||
|
||
extension AdjacentPairsCollection { | ||
public typealias Iterator = AdjacentPairsSequence<Base>.Iterator | ||
|
||
@inlinable | ||
public func makeIterator() -> Iterator { | ||
Iterator(base: base.makeIterator()) | ||
} | ||
} | ||
|
||
extension AdjacentPairsCollection { | ||
public struct Index: Comparable { | ||
@usableFromInline | ||
internal var first: Base.Index | ||
|
||
@usableFromInline | ||
internal var second: Base.Index | ||
|
||
@inlinable | ||
internal init(first: Base.Index, second: Base.Index) { | ||
self.first = first | ||
self.second = second | ||
} | ||
|
||
@inlinable | ||
public static func < (lhs: Index, rhs: Index) -> Bool { | ||
(lhs.first, lhs.second) < (rhs.first, rhs.second) | ||
} | ||
Comment on lines +184 to +187
You should only need to compare one of the two underlying indices here instead of both, and the same applies to |
||
} | ||
} | ||
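
A minimal sketch of what comparing a single underlying index might look like, for illustration only. `PairIndex` is a hypothetical stand-in for the `Index` type in this diff, and the sketch assumes that `second` is always derived from `first`; applying the same rule to `==` is also an assumption made here, not something stated in the review.

```swift
// Hypothetical stand-in for AdjacentPairsCollection.Index, generic over any
// comparable base index type. Not the code under review.
struct PairIndex<BaseIndex: Comparable>: Comparable {
    var first: BaseIndex
    var second: BaseIndex

    // Assumption: `second` is fully determined by `first`, so comparing
    // `first` alone is enough to decide equality.
    static func == (lhs: PairIndex, rhs: PairIndex) -> Bool {
        lhs.first == rhs.first
    }

    // Ordering likewise looks only at `first`.
    static func < (lhs: PairIndex, rhs: PairIndex) -> Bool {
        lhs.first < rhs.first
    }
}

let a = PairIndex(first: 0, second: 1)
let b = PairIndex(first: 1, second: 2)
print(a < b)
```

This halves the number of base index comparisons per call, at the cost of relying on the invariant between `first` and `second` holding for every index the collection vends.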
|
||
extension AdjacentPairsCollection: Collection { | ||
@inlinable | ||
public var endIndex: Index { | ||
switch base.endIndex { | ||
case startIndex.first, startIndex.second: | ||
return startIndex | ||
case let end: | ||
return Index(first: end, second: end) | ||
} | ||
} | ||
Comment on lines +192 to +200
I strongly suggest unconditionally representing |
||
|
||
@inlinable | ||
public subscript(position: Index) -> (Base.Element, Base.Element) { | ||
(base[position.first], base[position.second]) | ||
} | ||
|
||
@inlinable | ||
public func index(after i: Index) -> Index { | ||
let next = base.index(after: i.second) | ||
return next == base.endIndex | ||
? endIndex | ||
: Index(first: i.second, second: next) | ||
} | ||
|
||
@inlinable | ||
public func index(_ i: Index, offsetBy distance: Int) -> Index { | ||
|
||
if distance == 0 { | ||
return i | ||
} else if distance > 0 { | ||
let firstOffsetIndex = base.index(i.first, offsetBy: distance) | ||
let secondOffsetIndex = base.index(after: firstOffsetIndex) | ||
return secondOffsetIndex == base.endIndex | ||
? endIndex | ||
: Index(first: firstOffsetIndex, second: secondOffsetIndex) | ||
} else { | ||
return i == endIndex | ||
? Index(first: base.index(i.first, offsetBy: distance - 1), | ||
second: base.index(i.first, offsetBy: distance)) | ||
: Index(first: base.index(i.first, offsetBy: distance), | ||
second: i.first) | ||
Comment on lines +226 to +230
You could avoid computing the first and second indices separately here, since you can cheaply compute the first if you already have the second. |
||
} | ||
} | ||
|
||
@inlinable | ||
public func distance(from start: Index, to end: Index) -> Int { | ||
let offset: Int | ||
switch (start.first, end.first) { | ||
case (base.endIndex, base.endIndex): | ||
return 0 | ||
case (base.endIndex, _): | ||
offset = +1 | ||
case (_, base.endIndex): | ||
offset = -1 | ||
default: | ||
offset = 0 | ||
} | ||
|
||
return base.distance(from: start.first, to: end.first) + offset | ||
} | ||
|
||
@inlinable | ||
public var count: Int { | ||
Swift.max(0, base.count - 1) | ||
} | ||
} | ||
|
||
extension AdjacentPairsCollection: BidirectionalCollection | ||
where Base: BidirectionalCollection | ||
{ | ||
@inlinable | ||
public func index(before i: Index) -> Index { | ||
i == endIndex | ||
? Index(first: base.index(i.first, offsetBy: -2), | ||
second: base.index(before: i.first)) | ||
: Index(first: base.index(before: i.first), | ||
second: i.first) | ||
} | ||
} | ||
|
||
extension AdjacentPairsCollection: RandomAccessCollection | ||
where Base: RandomAccessCollection {} |
I'm not sure if this is beneficial, mostly because we're precomputing `startIndex`. Won't this mean we end up doing some duplicate work when iterating over `someCollection.adjacentPairs()`?