Commit Graph

197 Commits

Author SHA1 Message Date
Klondike Dragon
c4de5d4f6a Unify/fix timezone offset/name states
* Merge duplicate states (fixes lots of edge cases)
* Support for +00:00 is consistent with +0000 now
* Support (timezone description) after any offset/name
* Update tests to cover positive/negative cases
* Update example with new supported formats
2023-12-30 01:10:43 -07:00
Klondike Dragon
fd21b1ee3e Allow weekday prefix for most date formats
This is implemented now using the "skip" parser field, indicating
to skip the first N characters. This also avoids a recursive parse
in one more case (more efficient). This simplifies the state machine
a little bit, while the rest of the code needs to properly account
for the value of the skip field.

Also allow whitespace prefix without penalty.

Modify the test suite to pseudo-randomly add a weekday prefix
to the formats that allow it (all except the purely numeric ones).
2023-12-23 20:04:38 -07:00
Klondike Dragon
9f7bdf7101 Update go doc 2023-12-19 21:50:19 -07:00
Klondike Dragon
4d76f597be Fix ambiguous mm/dd that start with weekday
Options were not being properly passed to recursive parseTime call.
2023-12-18 23:40:08 -07:00
Klondike Dragon
4f7e8545ec Update example and README.md with new formats
Audited all test cases to make sure an example was listed for all known formats.
2023-12-18 23:19:16 -07:00
Klondike Dragon
65e6e8d1a9 Add support for dd-month-year format 2023-12-18 23:14:58 -07:00
Klondike Dragon
7a3c923820 Fix mm.dd.yyyy (time) format 2023-12-18 23:14:08 -07:00
Klondike Dragon
89df0f8c49 Comprehensive time validation 2023-12-18 20:52:16 -07:00
Klondike Dragon
a45d593447 Optimize checks for day of week and full month
Reduces CPU usage on large benchmarks by about 2-3% and prepares for future support of international month names.
2023-12-16 23:40:14 -07:00
Klondike Dragon
fbf07cc274 Optimize memory for error case
New option SimpleErrorMessages that avoids allocation in the error path. It's off by default to preserve backwards compatibility.

Added benchmark BenchmarkBigParseAnyErrors that takes the big set of test cases, and injects errors to make them fail at pseudo-random places.

This optimization speeds up the error path runtime by 4x and reduces error path allocation bytes by 13x!
2023-12-16 23:28:15 -07:00
Klondike Dragon
d2e1443c4d Comprehensive date format validation
Audit every stateDate so every unexpected alternative will fail.

In the process, fixed some newly found bugs:
* Extend format yyyy-mon-dd to allow times to follow it. Also allow full month name.
* Allow full day name before month (e.g., Monday January 4th, 2017)

Relevant confirmatory test cases were added.
2023-12-16 22:31:48 -07:00
Klondike Dragon
23f8fa1af0 Further optimize ambiguous parsing
Optimize the common and special case where mm and dd are the same length, just swap in place. Avoids having to reparse the entire string.

For this case, it's about 30% faster and reduces allocations by about 15%.

This format is especially common, hence the reason to optimize for this case.

Also fix the case for ambiguous date/time in the mm:dd:yyyy format.
2023-12-16 13:52:00 -07:00
Klondike Dragon
ed5310d0c1 Optimize ambiguous date parsing
Previously, for ambiguous date strings, it was always calling parse twice even when the first parse would have been successful.

Refactor so that parsing isn't re-attempted unless the first parse fails ambiguously.

Benchmark results show that with RetryAmbiguousDateWithSwap(true), it's now about 6.5% faster (ns/op) and reduces allocated bytes by 3.4%.
2023-12-16 12:55:03 -07:00
Klondike Dragon
f4307ef59d Heavily optimize memory allocations
Uses a memory pool for parser struct and format []byte

Uses a new Go 1.20 feature to avoid allocations for []byte to string conversions in allowable cases.

Go 1.20 also fixes a Go bug for parsing fractional seconds after a comma, so we can eliminate a workaround.

The remaining allocations are mostly unavoidable (e.g., time.Parse constructing a FixedZone location, or calls to strings.ToLower).

Results show an 89% reduction in allocated bytes for the big benchmark cases, and for some formats an allocation can be avoided entirely.

There is also a resulting 26% speedup in ns/op.

Details:

BEFORE:

cpu: 12th Gen Intel(R) Core(TM) i7-1255U
BenchmarkShotgunParse-12               19448 B/op        474 allocs/op
BenchmarkParseAny-12                    4736 B/op         42 allocs/op
BenchmarkBigShotgunParse-12          1075049 B/op      24106 allocs/op
BenchmarkBigParseAny-12               241422 B/op       2916 allocs/op
BenchmarkBigParseIn-12                244195 B/op       2984 allocs/op
BenchmarkBigParseRetryAmbiguous-12    260751 B/op       3715 allocs/op
BenchmarkShotgunParseErrors-12         67080 B/op       1679 allocs/op
BenchmarkParseAnyErrors-12             15903 B/op        200 allocs/op

AFTER:

BenchmarkShotgunParse-12               19448 B/op        474 allocs/op
BenchmarkParseAny-12                      48 B/op          2 allocs/op
BenchmarkBigShotgunParse-12          1075049 B/op      24106 allocs/op
BenchmarkBigParseAny-12                25394 B/op        824 allocs/op
BenchmarkBigParseIn-12                 28165 B/op        892 allocs/op
BenchmarkBigParseRetryAmbiguous-12     37880 B/op       1502 allocs/op
BenchmarkShotgunParseErrors-12         67080 B/op       1679 allocs/op
BenchmarkParseAnyErrors-12              3851 B/op        117 allocs/op
2023-12-16 10:48:24 -07:00
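The two techniques named in the commit above, a memory pool and an allocation-free []byte to string conversion, can be sketched as follows. This assumes the pool is a `sync.Pool` and the Go 1.20 feature is `unsafe.String`; the commit message names neither, and the `parser` struct here is a hypothetical stand-in with a single field.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// parser is a hypothetical stand-in for the real parser struct.
type parser struct {
	format []byte
}

// parserPool reuses parser values and their format buffers across calls.
var parserPool = sync.Pool{
	New: func() any { return &parser{format: make([]byte, 0, 64)} },
}

// getParser fetches a parser from the pool and resets its buffer length
// while keeping the allocated capacity.
func getParser() *parser {
	p := parserPool.Get().(*parser)
	p.format = p.format[:0]
	return p
}

// putParser returns a parser to the pool for reuse.
func putParser(p *parser) { parserPool.Put(p) }

// bytesToString returns a string sharing memory with b (requires Go 1.20+).
// The caller must not modify b while the returned string is still in use.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(&b[0], len(b))
}

func main() {
	p := getParser()
	defer putParser(p)
	p.format = append(p.format, "2006-01-02"...)
	fmt.Println(bytesToString(p.format))
}
```

The shared-memory string is only safe while the buffer is left untouched, which is presumably why the commit limits the conversion to "allowable cases".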
Klondike Dragon
0c3943eacd Support RabbitMQ log format (dd-mon-yyyy::hh:mm:ss)
Adapt https://github.com/araddon/dateparse/pull/122 by https://github.com/bizy01 to add support for RMQ log format. Refactor to avoid redundant code. Add format validations.

As a side note, this also adds support for the format dd-mm-yyyy:hh:mm:ss.
2023-12-15 20:22:47 -07:00
Klondike Dragon
249dd7368c Support git log format (Thu Apr 7 15:13:13 2005 -0700)
Adapt commit 99d9682a1c from https://github.com/araddon/dateparse/pull/92 by https://github.com/jiangxin (merge timeWsYearOffset case and validate format)
2023-12-15 17:42:07 -07:00
Klondike Dragon
18ec8c69f6 Expand Chinese date format support
Inspired by https://github.com/araddon/dateparse/pull/132 from https://github.com/xwjdsh -- made this more general to all time formats that could follow, and added format validation.

Also include the related README.md touchup from https://github.com/araddon/dateparse/pull/136
2023-12-15 17:14:03 -07:00
Klondike Dragon
cc63421875 Support times after yyyy.mm.dd dates
Fix for this bug mentioned in https://github.com/araddon/dateparse/pull/134

Also, the other cases mentioned in this PR are not valid formats, so add them to the TestParseErrors test, to document that this is expected.
2023-12-14 23:47:31 -07:00
Klondike Dragon
23869f345e Add support for mm/dd/yyyy, hh:mm:ss
Incorporate PR https://github.com/araddon/dateparse/pull/156 from https://github.com/BrianLeishman and adapt to also validate the format
2023-12-14 23:14:26 -07:00
Klondike Dragon
14fb9398e4 Fix parsing for format (time) UTC[+-]NNNN
Fixes https://github.com/araddon/dateparse/issues/158
2023-12-14 22:57:42 -07:00
Klondike Dragon
2b3f700718 Handle format "date time (MST)"
Was unable to handle standalone timezone in parentheses before.

Also update tests to indicate expected timezone name for all tests that are parsed in a specific location.

With updated logic/fixes, add tests to verify:
* Fix https://github.com/araddon/dateparse/issues/71
* Fix https://github.com/araddon/dateparse/issues/72
2023-12-13 23:58:04 -07:00
Klondike Dragon
df9ae2e32a Incorporate support for yyyymmddhhmmss.SSS
Incorporate PR https://github.com/araddon/dateparse/pull/144 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/143
2023-12-12 23:19:35 -07:00
Klondike Dragon
fc278d32da Incorporate support for dd-mm-yyyy (digit month)
Incorporate PR https://github.com/araddon/dateparse/pull/140 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/139

This also fixes https://github.com/araddon/dateparse/issues/155
(duplicate of issue 139)

PR is adapted to avoid duplicate code and validate format.
2023-12-12 23:07:11 -07:00
Klondike Dragon
18938f16ae Implement support for yyyy mon dd (2013 May 02)
Incorporate PR https://github.com/araddon/dateparse/pull/142 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/141
2023-12-12 21:42:48 -07:00
Klondike Dragon
301ffeee02 Add support for mon/dd/yyyy (Oct/31/1970) 2023-12-12 21:24:17 -07:00
Klondike Dragon
49f9259ee3 Add support for dd[th,nd,st,rd] Month yyyy
Incorporate PR https://github.com/araddon/dateparse/pull/128 from
https://github.com/krhubert to fix
https://github.com/araddon/dateparse/issues/127
2023-12-12 20:18:58 -07:00
Klondike Dragon
c62ed15d73 Support PMDT and AMT time zones
Also disallow PM and AM from being specified twice in the string.

Fixes https://github.com/araddon/dateparse/issues/149
2023-12-12 17:42:09 -07:00
Klondike Dragon
3ebc8bc635 Incorporate fix for dd.mm.yyyy format
Incorporates PR https://github.com/araddon/dateparse/pull/133 from https://github.com/mehanizm to fix https://github.com/araddon/dateparse/issues/129

Adds test cases to verify the following are already fixed:
* https://github.com/araddon/dateparse/issues/105
2023-12-11 23:46:44 -07:00
Klondike Dragon
1b1e0b3d33 Add extensive format validation, bugfixes
* Don't just assume we were given one of the valid formats.
* Also consolidate the parsing states that occur after timePeriod.
* Add subtests to make it easier to see what fails.
* Additional tests for 4-char timezone names.
* Fix https://github.com/araddon/dateparse/issues/117
* Fix https://github.com/araddon/dateparse/issues/150
* Fix https://github.com/araddon/dateparse/issues/157
* Fix https://github.com/araddon/dateparse/issues/145
* Fix https://github.com/araddon/dateparse/issues/108
* Fix https://github.com/araddon/dateparse/issues/137
* Fix https://github.com/araddon/dateparse/issues/130
* Fix https://github.com/araddon/dateparse/issues/123
* Fix https://github.com/araddon/dateparse/issues/109
* Fix https://github.com/araddon/dateparse/issues/98
* Addresses bug in https://github.com/araddon/dateparse/issues/100#issuecomment-1118868154

Adds test cases to verify the following are already fixed:
* https://github.com/araddon/dateparse/issues/94
2023-12-11 23:45:58 -07:00
Klondike Dragon
465140d619 Fix ineffective break statements 2023-12-08 18:31:28 -07:00
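In Go, an unlabeled `break` inside a `switch` that sits inside a `for` loop leaves only the `switch`, which is a common source of breaks that have no effect on the loop. The sketch below shows the general pitfall with hypothetical functions; it is not taken from the code this commit changed.

```go
package main

import "fmt"

// scanUnlabeled intends to stop at ';' but does not: the break only leaves
// the switch, so every remaining byte is still counted.
func scanUnlabeled(s string) int {
	n := 0
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case ';':
			break // ineffective: the for loop keeps running
		default:
			n++
		}
	}
	return n
}

// scanLabeled stops at the first ';' by breaking out of the labeled loop.
func scanLabeled(s string) int {
	n := 0
loop:
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case ';':
			break loop
		default:
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(scanUnlabeled("ab;cd"), scanLabeled("ab;cd"))
}
```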
Arran Ubels
01b692d1ce Another case. 2023-02-16 09:39:34 +11:00
Arran Ubels
b0b5409675 Unused code 2023-02-15 23:37:27 +11:00
Arran Ubels
8b765a5302 Skip white space 2023-02-15 23:37:23 +11:00
Arran Ubels
b1fd89e43f The only required one. 2023-02-15 16:31:37 +11:00
Arran Ubels
3a32cbb3d2 All of these did nothing 2023-02-15 16:28:56 +11:00
Arran Ubels
bf3a5b3040 Skip white space - to delete strategically 2023-02-15 16:27:43 +11:00
Arran Ubels
268a690081 So people don't have to check the string they can use the new errors.Is function 2023-02-15 16:26:18 +11:00
Arran Ubels
53a8cbdf09 Unnecessary bracket 2023-02-15 16:10:45 +11:00
Arran Ubels
094aad3f21 Commented code 2023-02-15 16:09:32 +11:00
Arran Ubels
eabb56b497 Text should be lowercase 2023-02-15 16:04:58 +11:00
Arran Ubels
515cd81767 S1023: redundant break statement (gosimple) 2023-02-15 16:02:13 +11:00
Arran Ubels
e654ac7b35 Bug fixes. 2023-02-15 15:56:17 +11:00
Arran Ubels
57a1767ebd SA4006: this value of err is never used (staticcheck) 2023-02-15 15:51:48 +11:00
Arran Ubels
5143d47e3e S1023: redundant break statement (gosimple) 2023-02-15 15:51:14 +11:00
Arran Ubels
14cb70eacb field offsetlen is unused (unused) 2023-02-15 15:50:03 +11:00
Arran Ubels
5335e6fe23 Error return value is not checked (errcheck) 2023-02-15 15:49:18 +11:00
radaiming
5dd51ed0f7 Fix possible panic 2021-04-28 23:23:48 +08:00
Aaron Raddon
0eec95c9db New date format 2020-07-20+00:00 fixes #110 2021-02-06 16:14:29 -08:00
Aaron Raddon
36fa6fb41d fix TZ-location override for format fixes #113 2021-02-06 15:23:24 -08:00
Aaron Raddon
0360d1282f Support len 2 TZ offsets: 2019-05-29T08:41-04 fixes #111 2021-02-06 13:42:06 -08:00
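The offset spellings mentioned in these commits (two-digit, four-digit, and colon-separated) each map to a distinct layout element in the standard library. The sketch below tries them in turn with `time.Parse`; `offsetLayouts` and `parseWithOffset` are hypothetical names, and this is not how the library detects offsets.

```go
package main

import (
	"fmt"
	"time"
)

// offsetLayouts lists one layout per offset spelling: -07:00, -0700, -07.
var offsetLayouts = []string{
	"2006-01-02T15:04-07:00",
	"2006-01-02T15:04-0700",
	"2006-01-02T15:04-07",
}

// parseWithOffset (hypothetical) returns the first successful parse, or the
// error from the first layout when none of them match.
func parseWithOffset(s string) (time.Time, error) {
	var firstErr error
	for _, layout := range offsetLayouts {
		parsed, err := time.Parse(layout, s)
		if err == nil {
			return parsed, nil
		}
		if firstErr == nil {
			firstErr = err
		}
	}
	return time.Time{}, firstErr
}

func main() {
	for _, s := range []string{
		"2019-05-29T08:41-04",
		"2019-05-29T08:41-0400",
		"2019-05-29T08:41-04:00",
	} {
		parsed, err := parseWithOffset(s)
		fmt.Println(s, parsed, err)
	}
}
```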