-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String list new #311
String list new #311
Conversation
The API for the module to manipulate lists of strings has been discussed and this has resulted in the current implementation
There was a typo in some comments - corrected
Thank you for pushing this PR forward. Note #320 by @awvwgk that proposes a non-fancy stdlib should provide a uniform and consistent model of strings, whatever that model may look like. So if there's a It seems to me that the |
@epagone wrote:
I must admit, when doing a preliminary review of this pull request I was confused by the |
Not @arjenmarkus, but the same train of thought emerged from discussion with @epagone in #268 (comment). I guess it just depends which PR (#311 or #320) reaches maturity first. As long as the behavior of the |
I am not sure I understand this. The two objects are intended to be
convenient offsets into the lists and as such are part of the public API
and you can use them for any list. They translate to integer indices
specific for the list in question.
@epagone, can you elaborate? How would you encapsulate list_end with the
derived type? I thought I solved your critique by not using special integer
values but instead independent derived types (actually there are only two
objects of that type, as the type itself is private). Perhaps I
misunderstood what you suggested.
Op ma 15 feb. 2021 om 11:58 schreef Ivan Pribec <[email protected]>:
… @epagone <https://github.com/epagone> wrote:
For example, I did raise a related issue
<#269 (comment)>
while reviewing the string list PR (together with other observations) where
the namespace was polluted with a list_end variable and I suggested to
encapsulated it within the derived type. Well, although we got to string
list "new" <#311>, well, let's
say that my comment did not get much traction
<https://github.com/fortran-lang/stdlib/blob/90b06ff89ac3dce00e6c1d2f770beca38b837e34/src/stdlib_stringlist.f90#L26>
smiley
I must admit, when doing a preliminary review of this pull request I was
confused by the list_head and list_end derived-type instances. Are these
two objects part of the public API and are they supposed to be reused for
multiple lists simultaneously?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR4J7N55PHCPQ5XZLT3S7D45JANCNFSM4XFADDSQ>
.
|
The behaviour of stringlist_type should be independent of the actual data
type - as long as the required operations are supported.
I do have an API question, however: do we want to mix ordinary strings and
the string_type or do we require strings to be presented as string_types
only? In other words:
Is this allowed (with cnversion to string_type under the hood):
call list%insert( 1, "A new string" )
Or do we require:
call list%insert( 1, string_type("A new string") )
Op ma 15 feb. 2021 om 12:10 schreef Ivan Pribec <[email protected]>:
… stdlib should provide a uniform and consistent model of strings, whatever
that model may look like. So if there's a string_type that provides the
string essentials, stringlist_type should be a list of string_type
instances. Do you agree?
Not @arjenmarkus <https://github.com/arjenmarkus>, but the same train of
thought emerged from discussion with @epagone <https://github.com/epagone>
in #268 (comment)
<#268 (comment)>
.
I guess it just depends which PR (#311
<#311> or #320
<#320>) reaches maturity
first. As long as the behavior of the stringlist_type is not coupled
tightly to the string objects in the list, I don't see any problems.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YRYRJVGFK7GS3Y2VFETS7D6K3ANCNFSM4XFADDSQ>
.
|
Thanks @arjenmarkus for the explanation, it is consistent with my understanding of the desired behavior. I am impressed by the smart design, that a single
Does the |
My gut feeling is that if the non-fancy string type in #320 is accepted we should allow both. This could imply using the
Perhaps the more interesting question is whether |
Preferably we return a fixed length character whenever possible, |
@arjenmarkus I very much appreciate that you considered my previous comments (and I think that you also understood my comments correctly). I think that your design to make a "generic" Ideally, I would like to use this feature of
PS: I still cannot bite my tongue and avoid saying that I do not like that we can create "holes" in the list with acceptable "out-of-bounds" indices (unless I have misunderstood your design again 😅 ). |
FWIW, I'd prefer the first option. |
I think the idea of the proposal is that an "out-of-bounds" access is simply translated to an empty string. |
:), no, the current design silently interprets an out-of-bounds index to be
at either end, so no holes anymore.
Op ma 15 feb. 2021 om 13:37 schreef Emanuele Pagone <
[email protected]>:
… I am not sure I understand this. The two objects are intended to be
convenient offsets into the lists and as such are part of the public API
and you can use them for any list. They translate to integer indices
specific for the list in question.
@epagone <https://github.com/epagone>, can you elaborate? How would you
encapsulate list_end with the derived type? I thought I solved your
critique by not using special integer values but instead independent
derived types (actually there are only two objects of that type, as the
type itself is private). Perhaps I misunderstood what you suggested.
@arjenmarkus <https://github.com/arjenmarkus> I very much appreciate that
you considered my previous comments (and I think that you also understood
my comments correctly). I think that your design to make a "generic"
list_end derived type might be too clever for some less-CS-literate
Fortraneers like me: I did not understand that it was meant to be valid for
any list. It might also confuse some users that might be wondering where
list_start or list_head is (the latter is private instead, IIUC).
Ideally, I would like to use this feature of stdlib according to two
alternative scenarios.
1. This module will become a dependency (or an extension?) of a
generic list module;
2. list_end is an integer encapsulated in type(stringlist_type).
PS: I still cannot bite my tongue and avoid saying that I do not like that
we can create "holes" in the list with acceptable "out-of-bounds" indices
(unless I have misunderstood your design again 😅 ).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR6OYZBUEX7MUHT4UTTS7EIQBANCNFSM4XFADDSQ>
.
|
Accessing an out-of-bounds element is indeed interpreted as an empty
string, but you cannot insert a string in, say, position 100, if there are
only 3 strings filled. It would become string no. 4.
Op ma 15 feb. 2021 om 13:42 schreef Ivan Pribec <[email protected]>:
… PS: I still cannot bite my tongue and avoid saying that I do not like that
we can create "holes" in the list with acceptable "out-of-bounds" indices
(unless I have misunderstood your design again sweat_smile ).
I think the idea of the proposal is that an "out-of-bounds" access is
simply translated to an empty string.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR26EVJIDQNZ5QGSX4TS7EJB5ANCNFSM4XFADDSQ>
.
|
Are we sure that we want to be limited by GCC 8 bugs? I understand it's not a super-old release, but considering that |
This means the current implementation allows to append at the end of a string list by: call list%insert(huge(0), "value") |
Understood, thanks. I like this solution better than the previous one, but I still think that this opens the door to hard to find bugs when we do not enforce a strict check on bounds. |
According to this answer (what is the biggest array size for double precision in Fortran 90?):
The actual array size seems to be limited by the fact, the size of the array in bytes must fit in a Of course the amount of available RAM is the second limitation which enters into consideration, particularly when the |
I just want to note that using A cyclic indexing system could address this problem
|
Should the (Alternatively one could pursue an |
Thank you. IMO this is the best option for its flexibility. |
Would this match the Python semantics as mentioned in #268 (comment) by @certik? |
I made a mistake in the comment above, sorry about that, ignore the list there, I made some clearer tables below. Python is using a zero based indexing in the forward direction and one based indexing in the backwards direction. This approach does not allow to access beyond the first item with the forward index and also you cannot access after the last element with the backwards index:
For Fortran we would use a one based forward index and maybe also a one based backwards index, but we will have one index leftover in this approach, which is zero:
The cyclic indexing I wanted to propose was using all possible indices with:
The one based indexing for both forward and backwards direction looks more intuitive, the only question is what do we do with the zero in this approach? |
I greatly prefer to use such an optional argument above the use of a
function.
Op ma 15 feb. 2021 om 17:45 schreef Emanuele Pagone <
[email protected]>:
… Should the insert and append provide an optional stat argument to query
for allocation errors?
Thank you. IMO this is the best option for its flexibility.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR22VGAY5COCVMPP2SLS7FFQ5ANCNFSM4XFADDSQ>
.
|
Hm, my first implementation did something along these lines, but it led to
ambiguities. Hence the introduction of a separate type to represent the
start and stop indices (and a simple arithmetic)
Op ma 15 feb. 2021 om 20:19 schreef Ivan Pribec <[email protected]>:
… A cyclic indexing system could address this problem
- 0, -len(list)-1: before the first element of the list
- 1, -len(list): first element of the list
- len(list), -2: last element of the list
- len(list)+1, -1: after the last element of the list
Does this match with the Python semantics as mentioned in #268 (comment)
<#268 (comment)>
by @certik <https://github.com/certik>?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR35OJIDYECH4XXHSPTS7FXV7ANCNFSM4XFADDSQ>
.
|
Could type(stringlist_type) :: list
associate(end => list%end()) ! list_end points to a private stringlist_index_t instance
! append to end of the list
call list%insert(end,"one")
call list%insert(end,"two")
call list%insert(end,"three")
! prepend using an integer position index (in this case head)
call list%insert(0,"zero")
! Print values "one", "two", "three"
do i = 2, end%as_int() ! ideally, this would be a protected member component, i.e. `end%idx`
print *, list%get(i)
end do
end associate Edit: I have devised an integer list as proof of concept. It appears to work correctly with Intel Fortran, but with gfortran (gcc version 10.1.0), the associate construct does not work correctly. The code can be found at: https://gist.github.com/ivan-pi/e2181b4335a1d93cbeb3d422c6732cfa Returned results
|
call append_stringlist%insert( list_after_end, slist ) | ||
end function append_stringlist | ||
|
||
function append_stringarray( list, sarray ) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
With the |
Sure, it would be a straightforward replacement of the type that is now
used internally.
Op za 13 mrt. 2021 om 13:00 schreef Sebastian Ehlert <
***@***.***>:
… With the string_type merged, can this patch make use of the newly
available functionality?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YRY3C6WYFUXJLVCR42DTDNHWNANCNFSM4XFADDSQ>
.
|
@aman-godara, could you elaborate? I am not quite seeing what you mean ;).
Op do 11 mrt. 2021 om 15:49 schreef Aman Godara ***@***.***>:
… ***@***.**** commented on this pull request.
------------------------------
In src/stdlib_stringlist.f90
<#311 (comment)>:
> + type(stringlist_type) :: prepend_string
+
+ prepend_string = list
+ call prepend_string%insert( list_head, string )
+end function prepend_string
+
+function append_stringlist( slist, list )
+ type(stringlist_type), intent(in) :: list
+ type(stringlist_type), intent(in) :: slist
+ type(stringlist_type) :: append_stringlist
+
+ append_stringlist = list
+ call append_stringlist%insert( list_after_end, slist )
+end function append_stringlist
+
+function append_stringarray( list, sarray )
in my opinion it will be better to have arguments as ( sarray , list ) so
that it becomes coherent with the previous functions convention
(append_stringlist, prepend_string, etc.).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR4XZO776ACL2K76KF3TDDJ6JANCNFSM4XFADDSQ>
.
|
This comment has been minimized.
This comment has been minimized.
Yes, that is right. That is a general feature with functions that return
values, not pointers.
Op wo 31 mrt. 2021 om 12:48 schreef Aman Godara ***@***.***>:
… ***@***.**** commented on this pull request.
------------------------------
In src/stdlib_stringlist.f90
<#311 (comment)>:
> + string_greater = string1%value > string2%value
+end function string_greater
+
+elemental logical function string_equal( string1, string2 )
+ type(string_type), intent(in) :: string1
+ type(string_type), intent(in) :: string2
+
+ string_equal = string1%value == string2%value
+end function string_equal
+
+function append_string( list, string )
+ type(stringlist_type), intent(in) :: list
+ character(len=*), intent(in) :: string
+ type(stringlist_type) :: append_string
+
+ append_string = list
I have a query in this line:
I think this will create a copy (deep copy) of list which will be
assigned to append_string here. Have I got it right?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR4PGSRISRY3CJUXKRLTGL4XRANCNFSM4XFADDSQ>
.
|
In Fortran the result of a function is contained in a variable with the
name of that function, unless you use the clause "result(varname)". So you
do not return the value as with C-like languages:
myResult = ...;
return myResult;
but instead:
myResult = ...
return ! If you really want to return explicitly
|
Oops, I should have made sure "myResult" is the result variable ;).
So:
real function myResult( ... )
... declare arguments
myResult = ...
end function myResult
Or:
function intriguingCalculation( ... ) result(myResult)
... declare arguments
real :: myResult
myResult = ...
end function intriguingCalculation
end
|
What is the current status of this patch? I marked it as waiting for changes, but maybe this is incorrect and we should rather look for more reviewers. @arjenmarkus, what do you think? |
It has fallen between the cracks :(. I will have to look at it again,
especially the possibility to use string_type in stead of the builtin
character string. Thanks for reminding me!
Op ma 31 mei 2021 om 10:32 schreef Sebastian Ehlert <
***@***.***>:
… What is the current status of this patch? I marked it as waiting for
changes, but maybe this is incorrect and we should rather look for more
reviewers. @arjenmarkus <https://github.com/arjenmarkus>, what do you
think?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#311 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAN6YR54PBOTVQNEOIKZIM3TQNCSBANCNFSM4XFADDSQ>
.
|
First step to pick it up again was to use stdlib's string_type module
instead of my home-grown version. Mostly works now, but I discovered a few
oddities in the test programs. I must have overlooked them before - will
have to use stdlib_error's check to aoid this in future :). Hopefully it is
indeed just the test programs that are wrong.
Op ma 31 mei 2021 om 16:46 schreef Arjen Markus ***@***.***>:
… It has fallen between the cracks :(. I will have to look at it again,
especially the possibility to use string_type in stead of the builtin
character string. Thanks for reminding me!
Op ma 31 mei 2021 om 10:32 schreef Sebastian Ehlert <
***@***.***>:
> What is the current status of this patch? I marked it as waiting for
> changes, but maybe this is incorrect and we should rather look for more
> reviewers. @arjenmarkus <https://github.com/arjenmarkus>, what do you
> think?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#311 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAN6YR54PBOTVQNEOIKZIM3TQNCSBANCNFSM4XFADDSQ>
> .
>
|
This is a bit of confusing. We have |
This might help: #399 (comment). |
else | ||
do i = idxn, list%size | ||
inew = i + number | ||
call move_alloc( list%string(i)%value, list%string(inew)%value ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move_alloc(FROM, TO)
@arjenmarkus if I have understood it correctly what happens if there is already a string at index inew
(TO
)?
If it gets overwritten we will lose what was stored at index inew
.
for example:
list = ["1", "2", "3", "4", "5", "6", "7", "8"]
and something like this executes:
do i = 3, 8
inew = i + 3
call move_alloc(list%string(i)%value , list%string(inew)%value)
Then I will get
["1", "2", _, _, _, _, _, _, "3", "4", "5"]
_
represents that it is deallocated.
Closed in favor of #470 |
No description provided.