327 Commits

Author SHA1 Message Date
klondikedragon
82b04c8f43
Merge de86b79126d3fd4ea441450face8181300834ca4 into 6b43995a97dee4b2c7fc0bdff8e124da9f31a57e 2025-04-18 16:21:38 -06:00
klondikedragon
de86b79126
Merge pull request #7 from itlightning/syslog-formats
Support for RFC3164/RFC5424 syslog formats
2025-04-12 15:50:26 -06:00
Klondike Dragon
b058bd310c Add support for RFC3164/RFC5424 formats
Many devices send dates that do not conform to the RFCs...

Also add support for the strange "TZ-0700" variant of the "UTC-0700"
offset.

Cover all the changes with new tests.
2025-04-12 15:48:03 -06:00
klondikedragon
2c0ba64777
Merge pull request #6 from itlightning/enhance-dateparse-example
Enhance dateparse example with flags related to ambiguous formats
2025-04-12 12:53:05 -06:00
Klondike Dragon
3df9b0840b Fix latest lint errors 2025-04-12 12:51:50 -06:00
Klondike Dragon
ee6424ca5d Upgrade go dependencies to latest 2025-04-12 12:37:53 -06:00
Klondike Dragon
4e89335799 Upgrade github actions 2025-04-12 12:37:46 -06:00
Klondike Dragon
fe33a563db Add prefer-day-first and related flags to example program
Support --prefer-day-first and --retry-ambiguous flags for example
program. Also provide an example where parsing would fail without it.
2025-04-12 12:27:56 -06:00
klondikedragon
eeb01af691
Merge pull request #3 from itlightning/chore/2024-02-17-upgrade-deps
(chore) upgrade deps
2024-02-17 08:50:57 -07:00
Klondike Dragon
0ebde14994 (chore) upgrade deps 2024-02-17 08:41:46 -07:00
klondikedragon
9bda545e17
Merge pull request #2 from elliot40404/patch-1
fixed lib import
2024-02-17 08:22:05 -07:00
Elliot
3b41d24dbb
fixed lib import 2024-01-10 02:29:41 +05:30
Klondike Dragon
597b525a1a Fix goreleaser github action 2024-01-08 22:40:45 -07:00
klondikedragon
69f12a31e3
Merge pull request #1 from itlightning/prerelease/v0.1.0
Prepare v0.1.0 release
2024-01-08 22:04:55 -07:00
Klondike Dragon
c943d3c348 Fork package to github.com/itlightning/dateparse
Various other cleanup:
* Update README.md
* Update github workflows
* Add to copyright
* Add .gitignore
2024-01-08 21:59:42 -07:00
Klondike Dragon
d5b3c60e9b Cleanup handling of TZ name parsing
Fully support the format where a TZ name is in parentheses after the
time (and possibly after an offset). This fixes the broken case where a
4 character TZ name was in parentheses after a time.
2023-12-30 12:10:37 -07:00
Klondike Dragon
c4de5d4f6a Unify/fix timezone offset/name states
* Merge duplicate states (fixes lots of edge cases)
* Support for +00:00 is consistent with +0000 now
* Support (timezone description) after any offset/name
* Update tests to cover positive/negative cases
* Update example with new supported formats
2023-12-30 01:10:43 -07:00
Klondike Dragon
fd21b1ee3e Allow weekday prefix for most date formats
This is implemented now using the "skip" parser field, indicating
to skip the first N characters. This also avoids a recursive parse
in one more case (more efficient). This simplifies the state machine
a little bit, while the rest of the code needs to properly account
for the value of the skip field.

Also allow whitespace prefix without penalty.

Modify the test suite to pseudo-randomly add a weekday prefix
to the formats that allow it (all except the purely numeric ones).
2023-12-23 20:04:38 -07:00
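
The "skip" idea from fd21b1ee3e can be illustrated with a minimal sketch that uses only the standard library. The names weekdaySkip and weekdays are hypothetical and are not taken from the package; the real parser tracks the skip count inside its state machine rather than in a helper like this.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// weekdays lists the full English day names; the 3-letter abbreviation
// of each is derived by slicing. (Hypothetical helper data.)
var weekdays = []string{
	"sunday", "monday", "tuesday", "wednesday",
	"thursday", "friday", "saturday",
}

// weekdaySkip returns how many leading bytes of s form a weekday prefix
// (name plus any following commas and spaces), or 0 if there is none.
func weekdaySkip(s string) int {
	for _, day := range weekdays {
		// Try the full name first, then the abbreviation.
		for _, name := range []string{day, day[:3]} {
			if len(s) <= len(name) || !strings.EqualFold(s[:len(name)], name) {
				continue
			}
			i := len(name)
			if s[i] != ',' && s[i] != ' ' {
				continue // e.g. "Monster": a name must end at a delimiter
			}
			for i < len(s) && (s[i] == ',' || s[i] == ' ') {
				i++
			}
			return i
		}
	}
	return 0
}

func main() {
	s := "Monday, 02 Jan 2006"
	skip := weekdaySkip(s)
	// Parse only the part after the prefix; no second full parse is needed.
	t, err := time.Parse("02 Jan 2006", s[skip:])
	fmt.Println(skip, t.Format("2006-01-02"), err)
}
```

Skipping by offset keeps the rest of the parser unaware of the prefix, at the cost that every later index into the string has to account for the skipped bytes.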
Klondike Dragon
9f7bdf7101 Update go doc 2023-12-19 21:50:19 -07:00
Klondike Dragon
5cb27939bd Update benchmark results 2023-12-18 23:52:17 -07:00
Klondike Dragon
4d76f597be Fix ambiguous mm/dd that start with weekday
Options were not being properly passed to recursive parseTime call.
2023-12-18 23:40:08 -07:00
Klondike Dragon
4f7e8545ec Update example and README.md with new formats
Audited all test cases to make sure an example was listed for all known formats.
2023-12-18 23:19:16 -07:00
Klondike Dragon
65e6e8d1a9 Add support for dd-month-year format 2023-12-18 23:14:58 -07:00
Klondike Dragon
7a3c923820 Fix mm.dd.yyyy (time) format 2023-12-18 23:14:08 -07:00
Klondike Dragon
89df0f8c49 Comprehensive time validation 2023-12-18 20:52:16 -07:00
Klondike Dragon
a45d593447 Optimize checks for day of week and full month
Reduces CPU usage on large benchmarks by ~2%-3% and prepares for future support of international month names.
2023-12-16 23:40:14 -07:00
Klondike Dragon
fbf07cc274 Optimize memory for error case
New option SimpleErrorMessages that avoids allocation in the error path. It's off by default to preserve backwards compatibility.

Added benchmark BenchmarkBigParseAnyErrors that takes the big set of test cases, and injects errors to make them fail at pseudo-random places.

This optimization speeds up the error path runtime by 4x and reduces error path allocation bytes by 13x!
2023-12-16 23:28:15 -07:00
Klondike Dragon
d2e1443c4d Comprehensive date format validation
Audit every stateDate so every unexpected alternative will fail.

In the process, fixed some newly found bugs:
* Extend format yyyy-mon-dd to allow times to follow it. Also allow full month name.
* Allow full day name before month (e.g., Monday January 4th, 2017)

Relevant confirmatory test cases were added.
2023-12-16 22:31:48 -07:00
Klondike Dragon
23f8fa1af0 Further optimize ambiguous parsing
Optimize the common special case where mm and dd are the same length: just swap them in place. This avoids having to reparse the entire string.

For this case, it's about 30% faster and reduces allocations by about 15%.

This format is especially common, hence the reason to optimize for this case.

Also fix the case for ambiguous date/time in the mm:dd:yyyy format.
2023-12-16 13:52:00 -07:00
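
The in-place swap from 23f8fa1af0 can be sketched as follows, assuming the layout is held in a byte slice and the offsets of the month and day tokens are known. The function name swapMonthDay and its parameters are hypothetical and chosen for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// swapMonthDay swaps two equal-length tokens of a layout in place.
// moi and dayi are the byte offsets of the month and day tokens and n is
// their shared length. It reports false, leaving the layout untouched,
// when an offset is out of range or the two tokens overlap.
func swapMonthDay(layout []byte, moi, dayi, n int) bool {
	if n <= 0 || moi < 0 || dayi < 0 || moi+n > len(layout) || dayi+n > len(layout) {
		return false
	}
	if d := moi - dayi; d > -n && d < n {
		return false // overlapping or identical ranges
	}
	for i := 0; i < n; i++ {
		layout[moi+i], layout[dayi+i] = layout[dayi+i], layout[moi+i]
	}
	return true
}

func main() {
	layout := []byte("01/02/2006") // month-first guess
	value := "13/02/2014"          // month 13 is invalid, so the guess was wrong
	if _, err := time.Parse(string(layout), value); err != nil {
		// Same-length tokens: fix the layout without rescanning the input.
		if swapMonthDay(layout, 0, 3, 2) {
			t, err := time.Parse(string(layout), value)
			fmt.Println(string(layout), t.Format("2006-01-02"), err)
		}
	}
}
```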
Klondike Dragon
ed5310d0c1 Optimize ambiguous date parsing
Previously, for ambiguous date strings, it was always calling parse twice even when the first parse would have been successful.

Refactor so that parsing isn't re-attempted unless the first parse fails ambiguously.

Benchmark results show that with RetryAmbiguousDateWithSwap(true), it's now about 6.5% faster (ns/op) and reduces allocated bytes by 3.4%.
2023-12-16 12:55:03 -07:00
Klondike Dragon
f4307ef59d Heavily optimize memory allocations
Uses a memory pool for parser struct and format []byte

Uses a new go 1.20 feature to avoid allocations for []byte to string conversions in allowable cases.

go 1.20 also fixes a go bug for parsing fractional sec after a comma, so we can eliminate a workaround.

The remaining allocations are mostly unavoidable (e.g., time.Parse constructing a FixedZone location, or calls to strings.ToLower).

Results show an 89% reduction in allocated bytes for the big benchmark cases, and for some formats an allocation can be avoided entirely.

There is also a resulting 26% speedup in ns/op.

Details:

BEFORE:

cpu: 12th Gen Intel(R) Core(TM) i7-1255U
BenchmarkShotgunParse-12               19448 B/op        474 allocs/op
BenchmarkParseAny-12                    4736 B/op         42 allocs/op
BenchmarkBigShotgunParse-12          1075049 B/op      24106 allocs/op
BenchmarkBigParseAny-12               241422 B/op       2916 allocs/op
BenchmarkBigParseIn-12                244195 B/op       2984 allocs/op
BenchmarkBigParseRetryAmbiguous-12    260751 B/op       3715 allocs/op
BenchmarkShotgunParseErrors-12         67080 B/op       1679 allocs/op
BenchmarkParseAnyErrors-12             15903 B/op        200 allocs/op

AFTER:

BenchmarkShotgunParse-12               19448 B/op        474 allocs/op
BenchmarkParseAny-12                      48 B/op          2 allocs/op
BenchmarkBigShotgunParse-12          1075049 B/op      24106 allocs/op
BenchmarkBigParseAny-12                25394 B/op        824 allocs/op
BenchmarkBigParseIn-12                 28165 B/op        892 allocs/op
BenchmarkBigParseRetryAmbiguous-12     37880 B/op       1502 allocs/op
BenchmarkShotgunParseErrors-12         67080 B/op       1679 allocs/op
BenchmarkParseAnyErrors-12              3851 B/op        117 allocs/op
2023-12-16 10:48:24 -07:00
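
The two techniques named in f4307ef59d, a pool for parser state and the Go 1.20 zero-copy []byte to string conversion, can be sketched roughly as below. The parser type, its fields, and the helper names are hypothetical stand-ins and do not reproduce the package's actual code; sync.Pool, unsafe.String, and unsafe.SliceData are standard library.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// parser is a hypothetical stand-in for per-call parser state.
type parser struct {
	format []byte // scratch buffer holding the layout being built
	skip   int
}

// parserPool recycles parser values so that the hot path does not
// allocate a new struct and buffer on every call.
var parserPool = sync.Pool{
	New: func() any { return &parser{format: make([]byte, 0, 64)} },
}

// getParser fetches a parser from the pool and resets its state.
func getParser() *parser {
	p := parserPool.Get().(*parser)
	p.format = p.format[:0] // keep capacity, drop old contents
	p.skip = 0
	return p
}

// putParser returns a parser to the pool for reuse.
func putParser(p *parser) { parserPool.Put(p) }

// bytesToString views b as a string without copying (Go 1.20+).
// It is only safe while b is not modified and not returned to the pool.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	p := getParser()
	p.format = append(p.format, "2006-01-02"...)
	layout := bytesToString(p.format) // no copy of the buffer
	fmt.Println(layout)
	putParser(p) // layout must not be used after this point
}
```

The zero-copy string aliases the pooled buffer, so it has to be consumed before the parser goes back into the pool; that constraint is what the commit means by "allowable cases".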
Klondike Dragon
0d2fd5e275 Add broader benchmarks
Uses the main test set for a broader stress test.
2023-12-16 08:18:54 -07:00
Klondike Dragon
0c3943eacd Support RabbitMQ log format (dd-mon-yyyy::hh:mm:ss)
Adapt https://github.com/araddon/dateparse/pull/122 by https://github.com/bizy01 to add support for RMQ log format. Refactor to avoid redundant code. Add format validations.

As a side effect, this also supports the format dd-mm-yyyy:hh:mm:ss.
2023-12-15 20:22:47 -07:00
Klondike Dragon
249dd7368c Support git log format (Thu Apr 7 15:13:13 2005 -0700)
Adapt commit 99d9682a1c from https://github.com/araddon/dateparse/pull/92 by https://github.com/jiangxin (merge timeWsYearOffset case and validate format)
2023-12-15 17:42:07 -07:00
Klondike Dragon
18ec8c69f6 Expand Chinese date format support
Inspired by https://github.com/araddon/dateparse/pull/132 from https://github.com/xwjdsh -- made this more general to all time formats that could follow, and added format validation.

Also include the related README.md touchup from https://github.com/araddon/dateparse/pull/136
2023-12-15 17:14:03 -07:00
Klondike Dragon
cc63421875 Support times after yyyy.mm.dd dates
Fix for this bug mentioned in https://github.com/araddon/dateparse/pull/134

Also, the other cases mentioned in this PR are not valid formats, so add them to the TestParseErrors test, to document that this is expected.
2023-12-14 23:47:31 -07:00
Klondike Dragon
23869f345e Add support for mm/dd/yyyy, hh:mm:ss
Incorporate PR https://github.com/araddon/dateparse/pull/156 from https://github.com/BrianLeishman and adapt to also validate the format
2023-12-14 23:14:26 -07:00
Klondike Dragon
14fb9398e4 Fix parsing for format (time) UTC[+-]NNNN
Fixes https://github.com/araddon/dateparse/issues/158
2023-12-14 22:57:42 -07:00
Klondike Dragon
d05b099ca6 Add better timezone explanation to README.md
How golang parses date strings with respect to time zones and locations can be really confusing. Document the key points that need to be understood to properly interpret the results of parsing arbitrary date strings, which may or may not have explicit time zone name or offset information included in the parsed date string.
2023-12-14 00:00:36 -07:00
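
One of the points that d05b099ca6 alludes to can be seen with the standard library directly: a numeric offset is always honored, while a zone abbreviation that the target location does not know is kept as a name with a zero offset. The helper zoneOf is hypothetical, and the sketch parses in time.UTC so the result does not depend on the machine's local zone.

```go
package main

import (
	"fmt"
	"time"
)

// zoneOf parses value with layout in the UTC location and reports the
// zone name and offset (seconds east of UTC) of the result.
func zoneOf(layout, value string) (string, int, error) {
	t, err := time.ParseInLocation(layout, value, time.UTC)
	if err != nil {
		return "", 0, err
	}
	name, offset := t.Zone()
	return name, offset, nil
}

func main() {
	// Abbreviation only: UTC has no "MST" zone, so the offset stays 0.
	fmt.Println(zoneOf("2006-01-02 15:04:05 MST", "2023-12-14 00:00:36 MST"))
	// Numeric offset: always applied as given.
	fmt.Println(zoneOf("2006-01-02 15:04:05 -0700", "2023-12-14 00:00:36 -0700"))
	// No zone information: the supplied location is used.
	fmt.Println(zoneOf("2006-01-02 15:04:05", "2023-12-14 00:00:36"))
}
```

Parsing the same abbreviation in a location that does define it (for example one loaded with time.LoadLocation) would yield that location's real offset, which is why the choice of location matters when interpreting results.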
Klondike Dragon
2b3f700718 Handle format "date time (MST)"
Was unable to handle standalone timezone in parentheses before.

Also update tests to indicate expected timezone name for all tests that are parsed in a specific location.

With updated logic/fixes, add tests to verify:
* Fix https://github.com/araddon/dateparse/issues/71
* Fix https://github.com/araddon/dateparse/issues/72
2023-12-13 23:58:04 -07:00
Klondike Dragon
8f0059d6da Add tests to verify ambiguous cases
Test cases now validate that the following are true:
* Fixed https://github.com/araddon/dateparse/issues/91
* Fixed https://github.com/araddon/dateparse/issues/28

(previous commits already address these issues; these tests ensure
that they remain fixed)
2023-12-12 23:40:07 -07:00
Klondike Dragon
df9ae2e32a Incorporate support for yyyymmddhhmmss.SSS
Incorporate PR https://github.com/araddon/dateparse/pull/144 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/143
2023-12-12 23:19:35 -07:00
Klondike Dragon
fc278d32da Incorporate support for dd-mm-yyyy (digit month)
Incorporate PR https://github.com/araddon/dateparse/pull/140 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/139

This also fixes https://github.com/araddon/dateparse/issues/155
(duplicate of issue 139)

PR is adapted to avoid duplicate code and validate format.
2023-12-12 23:07:11 -07:00
Klondike Dragon
18938f16ae Implement support for yyyy mon dd (2013 May 02)
Incorporate PR https://github.com/araddon/dateparse/pull/142 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/141
2023-12-12 21:42:48 -07:00
Klondike Dragon
301ffeee02 Add support for mon/dd/yyyy (Oct/31/1970) 2023-12-12 21:24:17 -07:00
Klondike Dragon
49f9259ee3 Add support for dd[th,nd,st,rd] Month yyyy
Incorporate PR https://github.com/araddon/dateparse/pull/128 from
https://github.com/krhubert to fix
https://github.com/araddon/dateparse/issues/127
2023-12-12 20:18:58 -07:00
Klondike Dragon
c62ed15d73 Support PMDT and AMT time zones
Also disallow PM and AM from being specified twice in the string.

Fixes https://github.com/araddon/dateparse/issues/149
2023-12-12 17:42:09 -07:00
Klondike Dragon
3ebc8bc635 Incorporate fix for dd.mm.yyyy format
Incorporates PR https://github.com/araddon/dateparse/pull/133 from https://github.com/mehanizm to fix https://github.com/araddon/dateparse/issues/129

Adds test cases to verify the following are already fixed:
* https://github.com/araddon/dateparse/issues/105
2023-12-11 23:46:44 -07:00
Klondike Dragon
1b1e0b3d33 Add extensive format validation, bugfixes
* Don't just assume we were given one of the valid formats.
* Also consolidate the parsing states that occur after timePeriod.
* Add subtests to make it easier to see what fails.
* Additional tests for 4-char timezone names.
* Fix https://github.com/araddon/dateparse/issues/117
* Fix https://github.com/araddon/dateparse/issues/150
* Fix https://github.com/araddon/dateparse/issues/157
* Fix https://github.com/araddon/dateparse/issues/145
* Fix https://github.com/araddon/dateparse/issues/108
* Fix https://github.com/araddon/dateparse/issues/137
* Fix https://github.com/araddon/dateparse/issues/130
* Fix https://github.com/araddon/dateparse/issues/123
* Fix https://github.com/araddon/dateparse/issues/109
* Fix https://github.com/araddon/dateparse/issues/98
* Addresses bug in https://github.com/araddon/dateparse/issues/100#issuecomment-1118868154

Adds test cases to verify the following are already fixed:
* https://github.com/araddon/dateparse/issues/94
2023-12-11 23:45:58 -07:00
Klondike Dragon
465140d619 Fix ineffective break statements 2023-12-08 18:31:28 -07:00